SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Takanori OKURA
Kakuji TORIGOE
Masashi KURIMOTO

(ii) TITLE OF INVENTION: GENOMIC DNA ENCODING A POLYPEPTIDE CAPABLE OF 

INDUCING THE PRODUCTION OF INTERFERON-γ

(iii) NUMBER OF SEQUENCES: 35 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: BROWDY AND NEIMARK

(B) STREET: 419 Seventh Street, N.W., Suite 300 

(C) CITY: Washington 

(D) STATE: D.C. 

(E) COUNTRY: USA 

(F) ZIP: 20004 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: JP 185,305/96

(B) FILING DATE: 27-JUN-1996

(viii) ATTORNEY/ AGENT INFORMATION: 

(A) NAME: BROWDY, Roger L.

(B) REGISTRATION NUMBER: 25,618

(C) REFERENCE/DOCKET NUMBER: OKURA=1

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 202-628-5197 

(B) TELEFAX: 202-737-3528 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 157 amino acids 

(B) TYPE: amino acid

(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 1: 



Tyr Phe Gly Lys Leu Glu Ser Lys Leu Ser Val Ile Arg Asn Leu Asn
1               5                   10                  15

Asp Gln Val Leu Phe Ile Asp Gln Gly Asn Arg Pro Leu Phe Glu Asp
            20                  25                  30

Met Thr Asp Ser Asp Cys Arg Asp Asn Ala Pro Arg Thr Ile Phe Ile
        35                  40                  45

Ile Ser Met Tyr Lys Asp Ser Gln Pro Arg Gly Met Ala Val Thr Ile
    50                  55                  60

Ser Val Lys Cys Glu Lys Ile Ser Xaa Leu Ser Cys Glu Asn Lys Ile
65                  70                  75                  80

Ile Ser Phe Lys Glu Met Asn Pro Pro Asp Asn Ile Lys Asp Thr Lys
                85                  90                  95

Ser Asp Ile Ile Phe Phe Gln Arg Ser Val Pro Gly His Asp Asn Lys
            100                 105                 110

Met Gln Phe Glu Ser Ser Ser Tyr Glu Gly Tyr Phe Leu Ala Cys Glu
        115                 120                 125

Lys Glu Arg Asp Leu Phe Lys Leu Ile Leu Lys Lys Glu Asp Glu Leu
    130                 135                 140

Gly Asp Arg Ser Ile Met Phe Thr Val Gln Asn Glu Asp
145                 150                 155
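
As a quick consistency check on SEQ ID NO: 1, the following Python sketch converts the three-letter residues listed above to one-letter codes and counts them against the stated LENGTH of 157 amino acids. It is only an illustration: the helper name `to_one_letter` and the variable names are hypothetical, the residue strings are transcribed from the listing, and the unspecified residue Xaa is rendered as X.

```python
# Three-letter to one-letter amino acid codes; Xaa marks an unspecified residue.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}

# Residues of SEQ ID NO: 1, transcribed row by row from the listing.
SEQ_ID_NO_1 = """
Tyr Phe Gly Lys Leu Glu Ser Lys Leu Ser Val Ile Arg Asn Leu Asn
Asp Gln Val Leu Phe Ile Asp Gln Gly Asn Arg Pro Leu Phe Glu Asp
Met Thr Asp Ser Asp Cys Arg Asp Asn Ala Pro Arg Thr Ile Phe Ile
Ile Ser Met Tyr Lys Asp Ser Gln Pro Arg Gly Met Ala Val Thr Ile
Ser Val Lys Cys Glu Lys Ile Ser Xaa Leu Ser Cys Glu Asn Lys Ile
Ile Ser Phe Lys Glu Met Asn Pro Pro Asp Asn Ile Lys Asp Thr Lys
Ser Asp Ile Ile Phe Phe Gln Arg Ser Val Pro Gly His Asp Asn Lys
Met Gln Phe Glu Ser Ser Ser Tyr Glu Gly Tyr Phe Leu Ala Cys Glu
Lys Glu Arg Asp Leu Phe Lys Leu Ile Leu Lys Lys Glu Asp Glu Leu
Gly Asp Arg Ser Ile Met Phe Thr Val Gln Asn Glu Asp
"""


def to_one_letter(listing: str) -> str:
    """Convert whitespace-separated three-letter residues to a one-letter string."""
    return "".join(THREE_TO_ONE[residue] for residue in listing.split())


protein = to_one_letter(SEQ_ID_NO_1)
xaa_position = protein.index("X") + 1  # 1-based position of the Xaa residue

print(len(protein), xaa_position)
```

The residue count and the position of Xaa can then be compared with the LENGTH field and with the residue numbering printed under the sequence.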



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1120 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: No

(iv) ANTI-SENSE: No

(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(F) TISSUE TYPE: liver

(ix) FEATURE:

(A) NAME/KEY: 5'UTR

(B) LOCATION: 1..177

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 178..285

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: mat peptide

(B) LOCATION: 286..756

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: 3'UTR

(B) LOCATION: 757..1120

(C) IDENTIFICATION METHODS: E
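
The four feature locations above can be checked against the stated LENGTH of 1120 base pairs with a few lines of arithmetic. The Python sketch below assumes 1-based inclusive coordinates, as is conventional in sequence listings; the variable names are illustrative only. It confirms that the features tile the sequence without gaps and that the mat peptide span corresponds to the 157 residues of SEQ ID NO: 1.

```python
# Feature table of SEQ ID NO: 2 as (name, start, end), 1-based inclusive coordinates.
features = [
    ("5'UTR", 1, 177),
    ("leader peptide", 178, 285),
    ("mat peptide", 286, 756),
    ("3'UTR", 757, 1120),
]

# Length of each feature in bases.
lengths = {name: end - start + 1 for name, start, end in features}

# Each feature should start one base after the previous feature ends.
contiguous = all(
    features[i][2] + 1 == features[i + 1][1] for i in range(len(features) - 1)
)

total_bases = sum(lengths.values())
leader_residues = lengths["leader peptide"] // 3  # codons in the leader peptide
mature_residues = lengths["mat peptide"] // 3     # codons in the mature peptide

print(lengths, contiguous, total_bases, leader_residues, mature_residues)
```

Under these assumptions the leader peptide spans 36 codons, which matches the residue numbering that starts at -36 for the initiator Met.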

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

GCCTGGACAG TCAGCAAGGA ATTGTCTCCC AGTGCATTTT GCCCTCCTGG CTGCCAACTC 60
TGGCTGCTAA AGCGGCTGCC ACCTGCTGCA GTCTACACAG CTTCGGGAAG AGGAAAGGAA 120
CCTCAGACCT TCCAGATCGC TTCCTCTCGC AACAAACTAT TTGTCGCAGG AATAAAG 177

ATG GCT GCT GAA CCA GTA GAA GAC AAT TGC ATC AAC TTT GTG GCA ATG 225
Met Ala Ala Glu Pro Val Glu Asp Asn Cys Ile Asn Phe Val Ala Met
    -35                 -30                 -25

AAA TTT ATT GAC AAT ACG CTT TAC TTT ATA GCT GAA GAT GAT GAA AAC 273
Lys Phe Ile Asp Asn Thr Leu Tyr Phe Ile Ala Glu Asp Asp Glu Asn
-20                 -15                 -10                 -5

CTG GAA TCA GAT TAC TTT GGC AAG CTT GAA TCT AAA TTA TCA GTC ATA 321
Leu Glu Ser Asp Tyr Phe Gly Lys Leu Glu Ser Lys Leu Ser Val Ile
                1               5                   10

AGA AAT TTG AAT GAC CAA GTT CTC TTC ATT GAC CAA GGA AAT CGG CCT 369
Arg Asn Leu Asn Asp Gln Val Leu Phe Ile Asp Gln Gly Asn Arg Pro
        15                  20                  25

CTA TTT GAA GAT ATG ACT GAT TCT GAC TGT AGA GAT AAT GCA CCC CGG 417
Leu Phe Glu Asp Met Thr Asp Ser Asp Cys Arg Asp Asn Ala Pro Arg
    30                  35                  40

ACC ATA TTT ATT ATA AGT ATG TAT AAA GAT AGC CAG CCT AGA GGT ATG 465
Thr Ile Phe Ile Ile Ser Met Tyr Lys Asp Ser Gln Pro Arg Gly Met
45                  50                  55                  60

GCT GTA ACT ATC TCT GTG AAG TGT GAG AAA ATT TCA AYT CTC TCC TGT 513
Ala Val Thr Ile Ser Val Lys Cys Glu Lys Ile Ser Xaa Leu Ser Cys
                65                  70                  75

GAG AAC AAA ATT ATT TCC TTT AAG GAA ATG AAT CCT CCT GAT AAC ATC 561
Glu Asn Lys Ile Ile Ser Phe Lys Glu Met Asn Pro Pro Asp Asn Ile
            80                  85                  90

AAG GAT ACA AAA AGT GAC ATC ATA TTC TTT CAG AGA AGT GTC CCA GGA 609
Lys Asp Thr Lys Ser Asp Ile Ile Phe Phe Gln Arg Ser Val Pro Gly
        95                  100                 105

CAT GAT AAT AAG ATG CAA TTT GAA TCT TCA TCA TAC GAA GGA TAC TTT 657
His Asp Asn Lys Met Gln Phe Glu Ser Ser Ser Tyr Glu Gly Tyr Phe
    110                 115                 120

CTA GCT TGT GAA AAA GAG AGA GAC CTT TTT AAA CTC ATT TTG AAA AAA 705
Leu Ala Cys Glu Lys Glu Arg Asp Leu Phe Lys Leu Ile Leu Lys Lys
125                 130                 135                 140

GAG GAT GAA TTG GGG GAT AGA TCT ATA ATG TTC ACT GTT CAA AAC GAA 753
Glu Asp Glu Leu Gly Asp Arg Ser Ile Met Phe Thr Val Gln Asn Glu
                145                 150                 155

GAC TAGCTATTAA AATTTCATGC CGGGCGCAGT GGCTCACGCC TGTAATCCCA 806
Asp

GCCCTTTGGG AGGCTGAGGC GGGCAGATCA CCAGAGGTCA GGTGTTCAAG ACCAGCCTGA 866
CCAACATGGT GAAACCTCAT CTCTACTAAA AATACTAAAA ATTAGCTGAG TGTAGTGACG 926
CATGCCCTCA ATCCCAGCTA CTCAAGAGGC TGAGGCAGGA GAATCACTTG CACTCCGGAG 986
GTAGAGGTTG TGGTGAGCCG AGATTGCACC ATTGCGCTCT AGCCTGGGCA ACAACAGCAA 1046
AACTCCATCT CAAAAAATAA AATAAATAAA TAAACAAATA AAAAATTCAT AATGTGAAAA 1106
AAAAAAAAAA AAAA 1120
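
The codon AYT in SEQ ID NO: 2 (the thirteenth codon of the row ending at base 513, which by counting falls at bases 502 to 504) uses the IUPAC ambiguity code Y, meaning C or T. That is why residue 73 is listed as Xaa. The minimal Python sketch below expands the ambiguity and looks up the resulting codons; the function name `expand` is hypothetical, and the codon table is deliberately restricted to the two standard-code entries needed here rather than being a full translation table.

```python
from itertools import product

# IUPAC nucleotide codes used in this example (subset only).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG"}

# Partial standard genetic code: just the codons that AYT can expand to.
CODON_TABLE = {"ACT": "Thr", "ATT": "Ile"}


def expand(codon: str) -> list[str]:
    """Return every unambiguous codon matched by an IUPAC-coded codon."""
    return ["".join(bases) for bases in product(*(IUPAC[base] for base in codon))]


codons = expand("AYT")
residues = sorted({CODON_TABLE[c] for c in codons})

print(codons, residues)
```

Read this way, Xaa at position 73 stands for either isoleucine or threonine, and the genomic exon in SEQ ID NO: 4 shows ACT (Thr) at the corresponding codon.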

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 135 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..135

(C) IDENTIFICATION METHODS: S

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

 AA AAC CTG GAA TCA GAT TAC TTT GGC AAG CTT GAA TCT AAA TTA TCA 47
Glu Asn Leu Glu Ser Asp Tyr Phe Gly Lys Leu Glu Ser Lys Leu Ser
    -5                  1               5                   10

GTC ATA AGA AAT TTG AAT GAC CAA GTT CTC TTC ATT GAC CAA GGA AAT 95
Val Ile Arg Asn Leu Asn Asp Gln Val Leu Phe Ile Asp Gln Gly Asn
                15                  20                  25

CGG CCT CTA TTT GAA GAT ATG ACT GAT TCT GAC TGT AGA G 135
Arg Pro Leu Phe Glu Asp Met Thr Asp Ser Asp Cys Arg Asp
            30                  35                  40

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 134 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..134

(C) IDENTIFICATION METHODS: S





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

 AT AAT GCA CCC CGG ACC ATA TTT ATT ATA AGT ATG TAT AAA GAT AGC 47
Asp Asn Ala Pro Arg Thr Ile Phe Ile Ile Ser Met Tyr Lys Asp Ser
40                  45                  50                  55

CAG CCT AGA GGT ATG GCT GTA ACT ATC TCT GTG AAG TGT GAG AAA ATT 95
Gln Pro Arg Gly Met Ala Val Thr Ile Ser Val Lys Cys Glu Lys Ile
                60                  65                  70

TCA ACT CTC TCC TGT GAG AAC AAA ATT ATT TCC TTT AAG 134
Ser Thr Leu Ser Cys Glu Asn Lys Ile Ile Ser Phe Lys
            75                  80



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 87 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE : placenta 

(iX) FEATURE : 

(A) NAME /KEY: exon 

(B) LOCATION: 1 . . 87 

(C) IDENTIFICATION METHODS: S 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GAATAAAG ATG GCT GCT GAA CCA GTA GAA GAC AAT TGC ATC AAC TTT GTG 50
         Met Ala Ala Glu Pro Val Glu Asp Asn Cys Ile Asn Phe Val
             -35                 -30                 -25

GCA ATG AAA TTT ATT GAC AAT ACG CTT TAC TTT ATA G 87
Ala Met Lys Phe Ile Asp Asn Thr Leu Tyr Phe Ile Ala
        -20                 -15                 -10

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..87

(C) IDENTIFICATION METHODS: S

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

 CT GAA GAT GAT G
Ala Glu Asp Asp Glu
-10

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2167 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: exon - 3'UTR

(B) LOCATION: 1..2167

(C) IDENTIFICATION METHODS: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:



GAA ATG AAT CCT CCT GAT AAC ATC AAG GAT ACA AAA AGT GAC ATC ATA 48
Glu Met Asn Pro Pro Asp Asn Ile Lys Asp Thr Lys Ser Asp Ile Ile
85                  90                  95                  100

TTC TTT CAG AGA AGT GTC CCA GGA CAT GAT AAT AAG ATG CAA TTT GAA 96
Phe Phe Gln Arg Ser Val Pro Gly His Asp Asn Lys Met Gln Phe Glu
                105                 110                 115

TCT TCA TCA TAC GAA GGA TAC TTT CTA GCT TGT GAA AAA GAG AGA GAC 144
Ser Ser Ser Tyr Glu Gly Tyr Phe Leu Ala Cys Glu Lys Glu Arg Asp
            120                 125                 130

CTT TTT AAA CTC ATT TTG AAA AAA GAG GAT GAA TTG GGG GAT AGA TCT 192
Leu Phe Lys Leu Ile Leu Lys Lys Glu Asp Glu Leu Gly Asp Arg Ser
        135                 140                 145

ATA ATG TTC ACT GTT CAA AAC GAA GAC TAGCTAT TAAAATTTCA TGCCGGGCGC 246
Ile Met Phe Thr Val Gln Asn Glu Asp
    150                 155



















AGTGGCTCAC GCCTGTAATC CCAGCCCTTT GGGAGGCTGA GGCGGGCAGA TCACCAGAGG 306

TCAGGTGTTC AAGACCAGCC TGACCAACAT GGTGAAACCT CATCTCTACT AAAAATACAA 366

AAAATTAGCT GAGTGTAGTG A3CCAT3CCC TCAATC3CAG CTACTCAAGA GGCTGAGGCA 426 

GG AG AAT CAG TTGCACTCCG GAG 3TGGAGG TTGTGGTGAG CCGAGATTGC ACCATTGCGC 4 86 

TCTAGCCTGG GC AAC AAC AG CAAAACTCCA T C T C AAAAAA TAAAATAAAT AAATAAACAA 54 6 

ATAAAAAATT C AT AAT G T G A ACTGTCTGAA TTTTTATGTT TAGAAAGATT ATGAGATTAT 6 06 

TAG T C T AT AA TTGTAATGGT GAAATAAAAT AAATACCAGT CTTGAAAAAC AT CAT T AAG A 666 

AAT G AAT GAA CT TTC AC AAA A 3 C AAA C AAA CAGACTTTCC CTTATTTAAG TGAATAAAAT 726 

AAAAT AAAAT AAAATAATGT TTAAAAAATT C A T AG TTT G A AAACATTCTA CATTGTTAAT 786 

TGGCATATTA ATTATACTTA ATATAATTAT TTTTAAATCT TTTGGGTTAT TAGTCCTAAT 84 6 

G AC A AAAG AT ATTGATATTT GAACTTTCTA ATTTTTAAGA ATATCGTTAA ACCATCAATA 906 

TTTTTATAAG GAGGCCACTT C A C T T G A C AA AT T T C T G AAT TTCCTCCAAA GTCAGTATAT 966 

TTTTAAAATT C AG T T T G AT C CTGAATC3AG C AAT AT AT AA AAG ■ 3 G AT TAT ATACTCTGGC 10 2 6 

CAACTGACAT TCATCCTAGG AATGCAAAGA TGGTTTAATA T C C T AAAAT C AATTAACATA 1086 

ACATACTATA T T AA T AAA G T ATCAAAACAG TATTCTCATC TTTTTTTCTT TTTTCACAAT 114 6 

TCCTTGGTTA C AC TAT CAT C T C AAT A3 AT G C AG AAAAAG C ATTTGACAAA ATCCAATTCA 12 06 

TAATAAAAAT TCTCAAACTT G AAAG AG AAC AT C AT AAAG G CAT C TAT GAA AAACCTACAG 12 6 6 

CT AAT ATC AT AC TT AAC GAT GAAAAACT 3A AT TAT TTT AC CCT AAG ATC A AG AAT AAT GC 1326 

AAGCATGTCA GCTCTTGCAA CTTCTATT 3 A ACATTGTACT GGAGGTTCTA GCCAGAGCAA 13 86 

C CAT AC AAT A AATAAAAATA AAAG G C AC CC AG AT TAG AAA GGAAGTCTTT ATTTGCAGAC 14 4 6 

AACATGGTTC TTTATGCAGA AAAC C G T 0 AG G AAT AC AC AC ACATGTTAGA ACTAATAAGT 15 06 

TCAGCAAGGT TGCAGGTTGC AAT ATC AAT A T G C AAAAAT A CATTGAAGGC TGGGCTCAGT 156 6 

GGAGATGGCA TGTACCTTTC GTCCCAGCTA CTTGGGAGG 3 TGAGGTAGGA GGATCACTTG 16 26 

AGGTGAGGAG TTTGAGGCTA TAGTGCAATG TGATCTTGCC TGTGAATAGC CACTGCACTC 16 86 

GAGCCTAGGC AA C AAAG T G A GACCCCGTCT CCAAAAAAAA AAATGGTATA TTGGTATTTC 174 6 

TGTATATGAA CAATGAATGA TCTGAAAACA AGAAAATTCC ATTCACGATG GTATTAAAAA 18 06 

AAT AAAAT AC AAATAAATTT A G C AAAAT AA TTATAAAACT TGTACATCGA AAATTTCAAA 18 6 6 







GCACTCTGAG GGAAATTAAA GATGATCTAA ATAATTGGAG AGACACTCTA TGATCACTGA 1926 

TTG 3AAAATT CATTCAATAT TGTTAA3ATA ACAATTGTCC CCAAATTGAT GCATGCATTC 19 3-5 

AATTTAGTCT TCATCAAAAT TCCAGCAGG 3 TTTTTGCAGA AATT GAC AAG CTGTACCCAA 204-5 

AATGTATATG GAAATGAAAA GACCCAGAA3 AGCAAATAAT T T TT T AAAAA CAAAGTTGGA 2106 

AAACTTTTA 3 TTCCTAATTT T AAAA 2 T T A 2 TATAAACCTA AAGT T AT C AA GACCATTTAG 216-5 

T 2167 
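
The exon listings can be cross-checked against the coding region of the cDNA in SEQ ID NO: 2. The Python sketch below is a rough consistency check under stated assumptions, not part of the listing: the exon order (SEQ ID NOs: 5, 6, 3, 4, 7) is inferred from the residue numbering, the 8 untranslated bases in SEQ ID NO: 5 are inferred from the running count of 50 at the end of its first row (14 codons), and the coding part of SEQ ID NO: 7 is taken as the 73 codons for residues 85 to 157 plus the TAG stop codon. All variable names are illustrative.

```python
# (listing, total length in bases, coding bases contributed), in inferred 5'->3' order.
exons = [
    ("SEQ ID NO: 5", 87, 87 - 8),       # assumed: 8 untranslated bases precede ATG
    ("SEQ ID NO: 6", 12, 12),
    ("SEQ ID NO: 3", 135, 135),
    ("SEQ ID NO: 4", 134, 134),
    ("SEQ ID NO: 7", 2167, 73 * 3 + 3),  # residues 85..157 plus the stop codon
]

# Coding bases summed over the exons.
exon_coding_bases = sum(coding for _, _, coding in exons)

# Coding region of SEQ ID NO: 2: leader peptide start (178) through the
# mat peptide end (756), plus three bases for the stop codon.
cdna_coding_bases = (756 - 178 + 1) + 3

print(exon_coding_bases, cdna_coding_bases)
```

If the two totals agree, the partial codons at the exon boundaries (for example the trailing G of SEQ ID NO: 3 and the leading AT of SEQ ID NO: 4, which together encode Asp 40) join in frame.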

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1334 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 1..1334

(C) IDENTIFICATION METHODS: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GTATTTTTTT TAATTCGCAA ACATAGAAAT GACTAGCTAC TTCTTCCCAT TCTGTTTTAC 60 

TGCTTA : "ATT GTTCCGTGCT A3TCCCAATC CTCAGATGAA AAGTCACAGG AGTGACAATA 12 0 

ATTTCACTTA C AG G AAA 2 T T T AT AAG G C AT CCACGTTTTT TAGTTGGGGT AAAAAATTGG 180 

ATACAATAAG ACATTGCTAG GGGTCATGCC TCTCTGAGCC TGCCTTTGAA TCACCAATCC 24 0 

CTTTATTGTG AT T G CAT T AA CTGTTTAAAA CCTCTATAGT TGGATGCTTA ATCCCTGCTT 300 

GTTACAGCTG AAAATGCTGA TAGTTTACCA GGTGTGGTGG CATCTATCTG TAATCCTAGC 360 

TACTTG3GAG GCTCAAGCAG GAGGATTGCT TGAGGCCAGG ACTTTGAGGC TGTAGTACAC 42 0 

TGTGATCGTA CCTGTGAATA G2CACTGCAC TCCAGCCTGG GTGATATACA GACCTTGTCT 4 80 

C T AAAATT AA AAAAJ\AAAAA AAAAAAAACC TTAGGAAAGG AAATTGATCA AGTCTACTGT 54 0 

GCCTTCCAAA ACATGAATTC C.AAATATCAA AGTTAGGCTG AG T T G AAG C A GTGAATGTGC 6 00 

AT TCT T T AAA AATACTGAAT ACTTACCTTA ACATATATTT TAAATATTTT ATTTAGCATT 66 0 

T AAAAG T T AA AAACAAT CTT T T AG AA T T ■ 2 A TATCTTTAAA AT A C T C AAAA AAGTTGCAGC 72 0 

GTGTGTGTTG TAATACACAT TAAACTGTGG GGTTGTTTGT TTGTTTGAGA TGCAGTTTCA 78 0 

CTCTGTCACC C AG G C T G AAG T3CAGT3CA3 TG2AGTGGTG TGATCTCGGC TCACTACAAC 840 

CTCCACCTCC CACGTTCAAG CGATTCTCAT GCCTCAGTCT CCCGAGTAG G TGGGAT.TACA 5 00 

GGCATGCACC ACTTACACCC G 3CTAATTTT TGTATTTTTA GTAGAGCTGG GGTTTCACCA 96 0 

TGTTGGCCA3 GCTGGTCT CA AA3CCCTAAC C T C AAG T G AT CTGCCTGCCT CAGCCTCCCA 102 0 

AACAAACAAA CAACCCCACA GTTTAATATG TGTTACAACA CACATGCTGC AACTTTTATG 103 0 

AGTATTTTAA TGATATA 3 AT TAT AAA AG G T TGTTTTTAAC TTTTAAATGC TGGGATTACA 1140 

GGCATGA 3CC ACTGTGCCAG GCCTGAACTG TGTTTTTAAA AATGTCTGAC CAGCTGTACA 12 0 0 

TA3TCTCCTG CAGACTGGCC AAGTCTCAAA GTGGGAACAG GTGTATTAAG GAC TAT C CTT 12 6 0 

TG 3TTAAATT TCCGCAAATG TTCCTGTGCA AGAATTCTTC T AA C TAG A G T TCTCATTTAT 132 0 

TATATTTATT TCA3 13 3 4 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4773 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 1..4773

(C) IDENTIFICATION METHODS: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:






GTAAGA 3T 




— , _ ^, -p. ^ „, rp -p _ .p, ,_p rp ~l 1 7 .mi 

bL^ i J r. . i 1 bl 1 1 i _ An 1 L 




AAT 3 AAT AT. A 


A T"pAGAA =-TA 


6 0 


TAACATTA 


TT 
AAT
TGAG 3CC


hA.A
GTGCAGGCAG CACTACCAGT TGGATAACCT GCAAGATTAT AGTTTGAAGT AATCTAACCA 3720

TTTCTCACAA GGGCCTATTG TGTGACTGAA ACATACAAGA ATCTG 2ATTT GGCCTT 2T AA 3780 

GGCAGGG2CC AGCCAAGGAG ACCATATTCA GGACAGAAAT TCAAGACTAC TATGGAACTG 3 84 0 

GAGTG 2TTG 2 CAGGG AAGA 2 AG A GTCAAGG AGTGCCAACT GAG 2CAATAC AG IAGG :TTA 3900 

CACAGGAA:C CA3GGCCTAG C 2CTACAACA ATTATTGGGT CTATTGACTG TAAGTTTTAA 3 96 0 

TTTCAGGCTC CA2TGAAAGA G T AAG G T AAG ATTCCTGGGA CTTTGTGTGT CT2TCACAGT 4 02 0 

TGGCTCAGAA ATGAGAAGTG GTCAGGCCAG GGATGGTGG 2 TTA 2AGGTGG AATCCCAGCA 4 03 0 

CTTTGGGAGG CCG AAGTGGG AGGGTGAGTT GAGGCCAGGA GTTCAGGACG AGCTTAGGCA 414 0 

A AAAAGTGAG ATACCCCCTG AGGCCTTCTC T AC AAAAAT A AAT T T T AAAA AT T AG G C AAA 42 00 

TGTG GTGGTG TATACTTAGA GTGCCAGGTA CTCAGGAGGG TGAG 3CAGGG GGATTGGTTG 42b 0 

A3CCCAG3AA TTCAAGGCTG CAGTGAGGTA TGATTTCACC ACTGGAGTTC TGGGTGGGGA 4320 

ACAGAGCGAG ACCCTGTCTC AAAGCAAAAA G AAAAAG AAA CTA 3 AACTAG CCTAA3TTTG 4 33 0 

T33 3AG3AGG TCATCATCGT CTTTAGCCGT GAATGGTTAT TATAGAGGAC AG.AAATTGAC 4 44 0 

ATTAG 2CCAA AAAGCTTGTG GTCTTTGCTG GAACTCTACT TAATCTTGAG C AAA TGTG G A 4500 

CACCACTCAA TGGGAGAG 3 A GA3AAGTAAG CTGTTTGATG TATAGGGGAA AACTA3AGGC 4560 

CTGGAACTGA ATATGCATCC CATGACAGGG AGAATAGGAG ATTCG3AGTT AAG AAG GAG A 4620 

G3AGGTGAGT ACTGCTGTTC AGAGATTTTT TTTATGTAAG TCTTGAGAAG C AAAA CTA G T 4680 

TTTGTTGTGT TTGGTAATAT ACTTCAAAAG AAACTTCATA TATTCAAATT GTTCATGTCC 4 74 0 

T3AAATAATT A3GTAATGTT TTTTTCTCTA TAG 4 773 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8835 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 1..8835

(C) IDENTIFICATION METHODS: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GTAAOAAATA TCATTCCTCT TTATTTGGAA AGTCAGCCAT GGCAATTAGA GGTAAATAAG 6 0 
CTAGAAAGCA ATTGAGAGGA ATATAAACCA TCTAGCATCA CTACGATGAG CAGTCAGTAT 120 
CAACATAAGA AATATAAGCA AAGTCAGAGT AGAATTTTTT TCTTTTATCA GATATGGGAG 180 
AGTATCACTT TAGAGGAGAG GTTCTCAAAC TTTTTGCTCT CATGTTCCCT TTACACTAAG 24 0 
CACATCACAT GTTA 3CATAA GTAACATTTT T AAT T AAAAA TAACTATGTA CTTTTTTAAC 3 00 
AACAAAA'^AA AGCATAAAG.A GTGACACTTT TTTATTTTTA CAAGTGTTTT AACTGGTTTA 3 60 
AT AG AAG 2 C A TAT AG ATCTG CTGGATTCTC ATCTGCTTTG CATTGAGACT ACTGCAATAT 420 
TGCACAGAAT GCAG :GTCTG GTAAACTCTG TTGTACACTC ATGAGAGAAT G G G T G AAAAA 4 80 
GACAAATTAC G T 0 T TAG AAT T 7 1 T T A G AAA T AGCTTTCACT TTAGGAACTC CCTGAGAATT 54 0 
G3TGCTTCAG AGTGGTAAGA TAAATAAGCT TCTCTTTAAA CGGAATCTCA A G A C A G AA T ■ 2 6 00 
A3TTACATTA A„AAG ^W^A AAAAAT TTGC CCATGGTTAG TCATCTTGTG AAATCTGCCA 660 
CAGGTTT GGA C T G G : 2 G T A C A ATTGGATAAT ATAGCATTCC CCG AG AT AAT TTTCTGTCAG 72 0 
AAT T AAG GAA AG 3G 3TGAAT AAAT AT C T C T GTTTGAAGTT GAATAACAAA AATTA 3GACC 78 0 
C G G T .AAA T T T TAGGGCTCCT G AAA T T C G T C TTTTTGCCTA TATTGAGCTA CTTTACGTTC 840 
T AT T AAAT G T TCTTT CAGGC CAGGTGCACT AGCTCATGCC TAGAATCTCA GGCAG3C GTG S^00 
AGCCCAG3AA TTTG AGACCA GGCAGGG 2AA CACAGTCTGT A LAAAAAAAT AAAAAAT TAG c j6 0 
CTGGGTGTGT T 3GTGCATGC CTGTAGAACT ACTCAGGATG CTGAGGACTG CTTGAGCCCA 1020 
G3ATAGCCAA ATGTGTGGTG AGTTCAGCCA CTAAACAGAG C GAG AC TTTG T 1 2 AAAAAAA G 1080 
AAAC AAAAAA AC AAA: AAA C TTC CTTCAAA ATAACTTTTT ATCTGCAATG TTTTCCTATT 114 0 
GCCTGTGAGA T T AA AT T T AC TCTTTTACCT G AT TTC C AAA GGG CTCCATA ATCTAATCCG 1200 
ACTTTACCTT GTGTTC ACTG CAAAATAGGA GGACTGTTCC ACTA 2 AAT GG AAAAA T C A C A 12 6 0 
G 5TTG 2 3TGG AGTGGGTCAG TGCTGTAATC CCAACACTTT G3AAGGCCAA GGGAG GTGGA 1320 
TTGCTTCAGC TGAGGAGTTC AAGAGCAGCC TGGGCAACAT GGCAAAAACC CTGTCTCTCC 1380 
AAAACATAGA AAAATTAGGG AGATGTGGTA GTATGTGCGT GTAGTCCCAA CTACTCAAAA 14 4 0 
GGCTAAG3CA AGAGGATCAG TTGAGCCCAG GAGGTCAAGG CTACAGTGAG CCATGTTTAC 1500 
TGTGTCAGTG CACTCGAGGC TGGGTGATAG AGCAAGACCA TGTCTCAAAA AAAAAAAAAA 156 0 
GAAAA3AAAA G AAAAAAA 3 A T 7G 1'TCTATT CAGTT 3A.CC I CCA3CACAA3 ATTG . iT^ijA 1620
TTATCACATA AATGGTGGTG CATTGCOTTG TGTATGTATT C AAATCT , i A AG "AT i 0 i i T 16 8 0 

gagattgaa: tcaattctoo tttt3aaact aggtcattta aagtaoat 3 a gttcoatttt 1740 

gattttcttg ctttgagtct a3a3a2t3aa aaa3aa7aao ttaaaaa3t7 a77ttttaa3 1800 

ttttgtggta ctgtgagttg tt 3aaca3t z ag at a 3 ac go att tataata a 3.atggcaga 186 0 

atgttgaa3 3 ataaaatgat tta7.a3aa37 gaaaa377a3 gttttgatct t3ttgot3to 1920 

AAGATGACTA CCTACCTGAT CTCA3 3TAAT T.AATTATGTA GCATGOTG3 3 T3ATTTG.ATG 193 0 
CCATA^CTAT T G AA AG G AT T33AATT3GA CAG7AAGGAT AAACATAATC ATAGTTGGTT 2040 
TTGAA3TTGA AGGCATTTTA A3TTTTAATG TAGTAGTATG TTTGTTGTTG TTGTTGTTGT 2100 
TTGA3ATGGA GGCGTGGTGT GTCA3C3A3G CTGGAGTGCA GTG 3 3ACG AA CTGG3 2TCAG 2160 
TG3AAOGTGT GCCTCATGG3 TTCAAT 3AGT TATTGTGGGT CAGTGTGGGA AGTA3GTG33 2220 
ACTAOAA3G3 AGATG 20 AGO AT3G2T3 3GT AATTTTTGTA TTTTTA3TA3 AAAGAGG3 3T 2280 
TGAGGATGTT GGCGAGGGTG GTGT2GAACT CGTGACCTGA AGTGATG G A3 CCGCCTCG3 3 2340 
CTCCCAAAGT GGTGGGATTA CAGG 3 .AT AAG CCACCGTG2C CAGCCTAAT.A GTAT 3TTTTT 24 00 
AAAC7C77AG 7GGC7TAACA ATGGTG 3TTG TATAATAAAT A T G 3 0 A T AAA TATTTACTGT 246 0 
C T T A G AA T T A TGAAGAAGTG GTTA 3 TAG 3 G CGTTTGCGAO ATATGAAT 3 3 T^GTGTCCiT 2 52 0 
A G A G 0 T T T AA TTAGAGTCTA GAATTG3AGG TTGGTAGA3G TG3AACA3A2 CiiAAA-GAiT 2580 
GA^TAG3CAA CTTCCTTGTC C7AA.T3A.G33 AA3TGAGA3G C T T AAAA T T A AGTGAGT i 3C 2640 
C G ^ AG A " AAA ACTGGAACTC ATGT3T3CTA ATTTGCAT3A T G AAA T T G T A GGATTGAGTA 2 70 0 
GCCTCTGGCT AGTTGTCAAA GTAT 1 3 3 ATA ACTAAATTTT TAT3TGT 3TT TTAAA3AA3A 2760 
AATTGTCACT GCTTACTCCT GG3A333TGT TTGTGAGGTG GTTTATAACT CTTAA.VA^ 2820 
AAAAAG T C AG TAGTGTGAGA ATTTTA3A3G AA.-. T A G T C AA AGGATTTTTA TCCAATGGAT 2 88 0 
CTATAATTTT CATAGATTAG AG T T .AAA T 3 A AA 3 AAA G A C G G AT G A G AAAG G AAGAG 3 AAA 2940 
ATTG AG 3 AG A GGAG3AATGG GGAT 3A3AAC AGAGTAGTTG TAATGAGTOA TAGATGTAGT 3000 
G A G AA G T AA C AAGAAGAATT G T AA 3 AAAA.T AAGAATGAAG AATTCAAAAT GAAGAGATGA 3 06 0 
AVTAAAAAGA AAGTAGTAGG G AAA AA T G 3 A GAA3AGATTA GAAAAATTAT TGTATTTTTA 3120 
AAATTCTGTT TTOAGGCTT 3 CCTC 3TGTTC TTGGTGGTTG TCATTGGTTT TGA3GT3 3AG 318 0 
G3AAA3TTTA AGATGGAAAA AATATATATA TTGTAGAGAT CCGTTTGTAC GGTGTT 3TCA 3 24 0 
TGGGAACAAG GTTTATCATA G3AAACTTTT AT T O AT A C AA CATTTATTGA GTTGTTAGTG 3 3 00 
4380


TGTTTGAGG 3 


A C7 A G AG AT G 


A. _t 3 - GA 3. AAA 


TCCCGCTGTT 


TA.GTA70707 


TG.AATTOATT 


4440 


ACGTTAATTT 


AT AG' 3GTGAT 




T T G T T AAAA T 


TA. 3'Z 1 TC7iG 3A. 


GTTG77G7 3T 


4 5 0 0 


GA/3CTCTG7G 


CAATOTTGTT 




CA7G37777G 


TTOTTGG'I - 3G 


TGGTGG'I'G 37 


4 56 0 




TTTO 3GTTTT 


G 3 r 3 3 3 3A3G 


CTG.AAGTG 3A 


GTGGAG-3.A0T 


T07v37G-37AO 


4620 


GA 3A3 30 TOG 


TGG 3TTTAA3 




7 C 3 307^3 TAG 


CTOGGAOTAC 




4 6 8 0 






GTGTT 3TCAG 


T A. G A -3 .A O A ' 3 G 


GTTTC.AOO.AT 


GT7"G , 3T-.AA 3G 


4 74 0 




ACTCCTGACC 


1GAA3:AA7G 


CAs. 3CC7iCC7C 


AG 3GT0 : 3G A7\ 


A j I' 3 . TG 3 jt.^ 


4 800 


TTAOAGGOAT 


GAG 1 3 O AG TGG 


aca-3G3a:ca 


GATOC7\TTGT 


TT.ATGT TG 3T 




4860 


GTT'I TTAAAA 


C A G AAA T T T G 


ACCATATCTT 


TG7G0AA777 


7A 3 7 O A G T A T 


TTTT"'TTTTC 


4920 


A3 3AA.AAAAG 


AGTTOAAACT 


CT77.A37 37G 


C T T A C AO .AA 3 


GO 3 7TTGT.AG 


TCTG.ACTOTT 


4980 


OTTTOOAAG 3 


TTT CATC AAA 


G TATA 3TG 3 A 


A3TTAC.ATTT 


TAT 3TG.AATT 


GAAT3A 3 3 3 A 


5 04 0 


AGG'j; i A I AAA 


AATTATAGTT 




AAAT'3G.AAAT 


7AT3TTAA3T 


CTTO-3A..A.ATA 


5100 


GTT FATCTA3 


7ATGAOATAA 


TTTO AAA 3 3T 


GTCAGGTCAA 


A.T 3 .AG T T 7iT A 


AAGT3T7AAC 


5160 


AGTATTG GGA 


CATG 3AAGTG 


TOT ITT AT AC 


TT 3 3TA3.AAT 


TATOTG 3T FC 


CATGTOA.T TA 


5220 


TTATGTAAAT 


TAGACTTTAA 


A T AA 3 T G A- 3 A 


7,GTTCTTG7,G 


ACATACA-3 3T 


3'ATT.ATT 3TG 


5280 


CTTTTTAAAO 


ATAATTTTAA 


ATAATTTTAT 


7 v T A. T G A T 7 A T 


GTTATCC AAG 


TGCTAAGGGA 


5340 


TGTATTGTTA 


CTGGTGTGGA 


A.AAAAAAAAA 


7 AAA A AAAA C 


T C C AAA T .AAA 


TATGTTGAAA 


5400 


CGAAGTTTAT 


ATG 3 AAG AAA 


AGAATATT.AA 


AAAG 300 AAA. 


GTACCACCAT 


7ATAGGOTGT 


5460 


GTG 3 AG AGG 3 


CAGG- 3TAGAA 


AACAOTA GTA 


ATAATGGTGA 


G AAA G T T G AA 


7AAAGAAAGA 


5520 


AAG 3AAGAAT 


ATGCTTTGGT 


TGTTGTAG 3T 


TT7i.TGTA.CTC 


CAA3AATATC 


TGCTOTOAAA 


5580 
CTTTTAC3TT TTTTCCAAAG AAAA '3 T T AA C TTTGGCTGGG CGCAGTGG3T CTTGCCTGTA 564 0
GTCC^A'^T TTGGGAG 323 AAGGC3G3CA GATCACCTGA GGTCAGGA3T TTGAGACCAG 5 7 00 
PCTGA-CAAA AATG3AGAAA C3CGCCCCCC TCACTACTAA AAG AAT AC .AA AATTAGGCCG 576 0 
GGCA"A3T-G CTTACC"T3 T3AT:2CAG3 ACTTTGGGAG GCCGAAGCAG GAAGATCACC 5820 
TGAGGT^A^ AGTTCGA 3A3 CAG22AT33A GAAACCCGTC TCTACTAAAA ATACAAAATT 5880 
A "CGGG 3 3T GGTGGTG 3AT GACT3TAAT3 CCAGCTACTC AGGAGG CT AA GGCAGAGAAT 5 94 0 
CACTTGAACC CAGGCAGTG 3 A3 3TT3CA3T GAG 3CGAGAT CGTGCCATTG CACTCCAGCC 6000 
TGGGAAACAA GAGCGAAACT C7G7A7CCAA AAAAC AAAAG AAAAG AAAAG GTAACCTTGA 6 06 0 
A-TATGTGAG ATCTTTA 3AA ATGCATTCTT TCTGTAAAAT GTGACTACAT TTGCCTTATT 612 0 
TATGGTAA AA ATGTTGA 3 3 2 CTCAAACAAC CCATATTTTC TCGGTCTCCC CG2TGCCTAG 618 0 
CCTTTGTTCA CATTGCTT3T TCTT3 3T3GA A3CTCTTCCT CTGGCCTTGA AAATGCCTGC 6240 
TTCTCTTTCA AG3TAG3A3A GTCATCACTT TCTGTGGTAA CCTTCTCCAG CA3CATCAAA 6 300 
C A 3 AAAG AAT GAATCTC7T 3 T AAA T T C AG C TCTTACGTCA TTCATTACAT TATTTTGTAA 636 0 
CT^TTTATAG ATTCTT 3T3T CCCACTAGAC TCTGAGTCAC TGGAGAGTAG GAG C C A AC T C 642 0 
T^ATTCATGT GTGGTTT3 3T CAGCTACTGG CCACATTCCT GATGCATAGT TAATGCTCAA 648 0 
A~^TTAACTG G T G AAT C A 3 2 T 3AAATATTG TCCTTCTCTA AATCCATTCA CTCATTGACT 6 54 0 
AA 2TATGTA f ^ T C AAAA.T A 3 T AAACACCAGT AATTTAATCC AATTCCTGCC CATACTGCTT 6 6 00 
G—TA^AT^TC AGGTGAAT7A G T T T G A T AAA TATGTGTGTA TTACATAATA TTAAAGTATG 6660 
7^AGAA"T CATGCTAA7C ATAATTCACA ACTGATAACT AAT C AAA CAT AAATGCTCTC 6720 
3TTAA7AA ATGTCTGCCT T 2TCAGTTAA TGCAGTCATT AACAAACACC TTCTGATGCT 6780 
GATAATA^GG CCTTGTTCA3 CAATCAAG3C A T AAA G G TG A ATAAAGAACA TGCCCTCGTG 6 84 0 
GA^CTCA^AG CCTAGTCATT A7TGTTCTGA TTTTTAATAT TAATGTTGGT TTGGGTTTTG 6900 
GT— AAAAA 7G TTTAGAC77A TCTTAGTGAT CTTTTCATCC TTTGCTATAT TATTTTTCTC 6 96 0 
TAAGAGTCTT CCTTATCCC2 7 C C 7 77 AAAA AACTAGGTGA TAATTCTAAA TTGTAAATTT 702 0 
A AAT ATT AT A AAT AG C T T A T AAAATTTAAT ATTTATAATA TTTAAATGTT TGATAAATAT 7 08 0 
r^aj.ATTT'-a T AA T AT T T AA ATGTTTATTT AAATTCATTT GTACATCAGT TTTTATTTTA 714 0 
T^TAA-Tl^G TTGGCCAGGC ATGGTGGCTG ACACCTATAA TCCCAGAACT T T GAG AG G C C 7200 
A^G^CA^A PJ- C C AT T T 3 A G7T3AG3AGT TTGAGACCAC CCTGGGCAAC GTGGTGAAAC 7260 
CT7G7C7C7A C C AAA C A 7 A 7 G AAAA C T T AT CTGGGTGTGG TGGCACGCAT CTGTGGTCCC 732 0 
A^ATGG-^T CCrAGs"7AA GA7GC-3AGAA TCG 3TTGAAC CCAGGTGAGA GGGGTGGGGT 7380 
GGATGT^JA G7GA3C7GAG A7CG7 3 2CAC TGCACTCCAA CCTGGGTGAC AGAGTGAGAC 744 0 
7 C AT C 7 C AA AAAAAAAAAA 7G77A7 37AA A.T AA 3 AT AAA T T T AAT AA C T GTTCGCACTT 750 0 
A'— TGA~3AT AAGGAA3TAA A3CTA3ATAA AAC T AT C AAA TAAGGCCTGG GTACAGTGAC 756 0 
~~CATGCr7GT AA7C7CAAGC A2777 3 3GAG G3CAAAATTA TACAAAG7TA GTTGTATAAC 762 0 
A^CAACTAAC AACTATTT7G G3GTCA3C77 AAT TC AG ATT AATTTTTTTT AAACTGAGTT 76 SO 
77AAATTCCT GCTTAC7C7A CCA7A3A7GC TA3G CCTCAT ATTATGCTAG AAAAATTTTG 774 0 
A3CACACATT TATGAATACT CTCCTG2ATA CCA777AA7T TTTAAACAAA TTTTAATGCA 78 0 0 
G TAT A TAT 37 GCC77777A3 CAACA2A77A AA7AA7AAGA TCTACTGTGA GGACTAAATT 7 86 0 
^CTGTAATTT C AAAG T AG 7 A ATGAG7T7AA ACCATGTCTC AAGATCTCTG CAATAACTGT 7 92 0 
A3CACAACAG AAAA T A 3 '3 T .A T T T C T A T T AA TGACAGAGTC ACAAGTACTA CT AAT AAT AC 7 98 0 
T ^TGGTTT 37 TTCCT-3 3AA3 TAATCA7GGG AG G .AAT G C T A AAT T T C AG AG G T T G G TG AAA 8 04 0 
A7A2A7G7G7 A77TT777C2 C AAT 7 AAAG T TCACAGATTT CTCACACTGA GAACTCCTAT 810 0 
^3P-^AA"-A AATTCT 3 3.AA 3CTGCACAC CGTATTGGAA GAAGGGCAGA AAGG AAAAG C 8160 
A^ATGGAA3G AT7TAAA777 7777 2AAATC CTGTATCCCT TGATTTTACA GCAAGATTGT 82 2 0 
A~TTATG~*-T TA^77 37 -77 AAAAATATAG TAT AA T C G A G ACTCCAGATC AAAAA.TCACC 8280 
C. "AGO 7 "AG 3 GAGAAA3.AG 3 G 2CAG:AAAT GCCAGAGCCC TTCAGCCTTC TCCCACCCTG 8340 
CCTGTACCCT C A : 3 A.T ' 3 3 AA 3 CA777777TA TC.ATTGTTTC ACCTTTAGCA TTTTGACAAT 8400 
GAAG7CAAAA ACCTTCA3 3C 7CTCACCCA7 AG 3 AACCCA 3 TGGTTGTAAG AGAAGGATGA 84 6 0 
AGCCAGT7CT TCCTAAA3 3 3 CACGATTA3A TGTGTTTATG GCATCCTCAG GTG AAACTAT 8 52 0 
a'^TTATATTG AAAA7A7A77 7A7A77TC7C AA G G AAT A C T A3 AAT AAT G A TTCAGTTCAG 8580 
TACTAGG-CA TTTATCT.A C TTT AT AAT A T7G77TAAT 3 A ' 3 AAAA T G 3 T TTCTATCTTC 8640 
C.AAATAT 3 TG ATGATTTG7A AGAGAACACT T.A.AAC ATG 3 3 T ATT CAT AAG CTGAAACTTC 8700 
TG3CA777A7 T G AA T G 7 C A A. OATT3TT3A7 C A 3TA7AC7A G 3TGATT AAC T3ACCACTGA 876 0 
ACTTG AA ~ : G7 AGTAT.AAAGT AGTAGTAAAA G3TACAATCA TTGTCTCTTA ACAGATG3CT 8 82 0 



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1371 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 1..1371

(C) IDENTIFICATION METHODS: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GTAAGGCTAA TGCCATAGAA CAAATACCAG GTTCAGATAA ATCTATTCAA TTAGAAAAGA 60

TGTTGTGAGG TGAACTATTA AGTGACTCTT TGTGTCACCA AATTTCACTG TAATATTAAT 120

GGCTCTTAAA AAAATAGTGG ACCTCTAGAA ATTAACCACA ACATGTCCAA GGTCTCAGCA 180

CCTTGTCACA CCACGTGTCC TGGCACTTTA ATCAGCAGTA GCTCACTCTC CAGTT SGCAG 240

TAAGTGCAGA TCATGAAAAT CCCAGTTTTC ATGGGAAAAT CCCAGTTTTC ATTGGATTTC 300

CATGGGAAAA ATCCCAGTAC AAAACTGGGT GCATTCAGGA AATACAATTT CCCAAAGCAA 360

ATTGGCAAAT TATGTAAGAG ATTCTCTAAA TTTAGAGTTC CGTGAATTAC ACCATTTTAT 420

GTAAATATGT TTGACAAGTA AAAATTGATT CTTTTTTTTT TTTTCTGTTG CCCAGGCTGG 480

AGTGCA3TGG CACAATCTCT GCTCACTGCA ACCTCCACCT CCTGGGTTCA AGCAATTCTC 540

CTGCCTCAGC CTTCTGAGTA GCTGG3ACTA CAGGTGCATC CCGCCATGCC TGGCTAATTT 600

TTGGGTATTT TTACTAGAGA CAGGGTTTTG GCATGTTGTC CAGGCTGGTC TTGGACTCCT 660

GATCTCAGAT GATCCTCCTG GCTCGGGCTC CCAAAGTGCT GGGATTACAG GCATGAACCA 720

CCACACATGG CCTAAAAATT GATTCTTATG ATTAATCTCC TGTGAACAAT TTGGCTTCAT 780

TTGAAAGTTT GCCTTCATTT GAAACCTTCA TTTAAAAGCC TGAGCAACAA AGTGAGACCC 840

CATCTCTACA AAAAACTGCA AAATATCCTG TGGACACCTC CTACCTTCTG TGGAG 3CTGA 900

AGCAGGAGGA TCACTTGAGC CTAGGAATTT GAGCCTGCAG TGAGCTATGA TCCCACCCCT 960

ACACTCCAGC CTGCATGACA GTAGACCCTG ACACACACAC ACAAAAAAAA ACCTTCATAA 1020

AAAATTATTA GTTGACTTTT CTTAGGTGAC TTTCCGTTTA AGCAATAAAT TTAAAAGTAA 1080

AATCTCTAAT TTTAGAAAAT TTATTTTTAG TTACATATTG AAATTTTTAA ACCCTAGGTT 1140

TAAGTTTTAT GTCTAAATTA CCTGAGAACA CACTAAGTCT GATAAGCTTC ATTTTATGGG 1200

CCTTTTGGAT GATTATATAA TATTCTGATG AAAGCCAAGA CAGACCCTTA AACCATAAAA 1260

ATAGGAGTTC GAGAAAGAGG AST A 3C AAAA GTAAAAGCTA GAATGAGATT GAATTCTGAG 1320

TCGAAATACA AAATTTTACA TATTCTGTTT CTCTCTTTTT CCCCCTCTTA G 1371

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3383 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 1..3383

(C) IDENTIFICATION METHODS: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GTAAAGTAGA AATGAATTTA TTTTTCTTTG CAAACTAAGT ATCTGCTTGA GACACATCTA 60

TCTCACCATT GTCAGCTGAG GAAAAAAAAA AATGGTTCTC ATGCTACCAA TCTGCCTTCA 120

AAGAAATGTG GACTCAGTAG CACAGCTTTG GAATGAAGAT GATCATAAGA GATACAAAGA 180

AGAACCTCTA GCAAAAGATG CTTCTCTATG CCTTAAAAAA TTCTCCAGCT CTTAGAATCT 240

ACAAAATAGA CTTTGCCTGT TTCATTGGTC CTAAGATTAG CATGAAGCCA TGGATTCTGT 300

TGTAGGGGGA GCGTTGCATA GGAAAAAGGG ATTGAAGCAT TAGAATTGTC CAAAATCAGT 360

AACACCTCCT CTCAGAAATG CTTTGGGAAG AAGCCTGGAA GGTTCCGGGT TGGTGGTGGG 420

GTGGGGCAGA AAATTCTGGA AGTAGAGGAG ATAGGAATGG GTGGGGCAAG AAGACCACAT 480

TCAGAGGCCA AAAGCTGAAA GAAACCATGG CATTTATGAT GAATTCAGGG TAATTCAGAA 540

TGGAAGTAGA GTAGGAGTAG GAGACTGGTG AGAGGAGCTA GAGTGATAAA CAGGGTGTAG 600

AGCAAGACGT TCTCTCACCC CAAGATGTGA AATTTGGACT TTATCTTGGA GATAATAGGG 660

TTAATTAAGC ACAATATGTA TTAGCTAGGG TAAAGATTAG TTTGTTGTAA CAAAGACATC 720

CAAAGATACA GTAGCTGAAT AAGATAGAGA ATTTTTCTCT CAAAGAAAGT CTAAGTAGGC 780

AGCTCAGAAG TAGTATGGCT GGAAGCAACC TGATGATATT GGGACCCCCA ACCTTCTTCA 840

GTCTTGTACC CAT CATC GC TAGTTGTTGA TCTCACTCAC ATAGTTGAAA ATCATCATAC 900

TTCCTGGGTT CATATCC G T T AT C AAG AA AGGGTCAAGA GAAGTCAGGC TCATTCCTTT 960

CAAAGACTCT AATTGGA GT T AAA C A C A T C AATCCCCCTC ATATTCCATT GACTAGAATT 1020

TAATCAGATG GCCAGAC AA GTGCAAGGAA ATCTGGAAAA TATAATCTTT ATTCCAGGTA 1080

GCCATATGAC TCTTTAAAAT TCAGAAATAA TATATTTTTA AAATATCATT CTGGCTTTGG 1140

TATAAAGAAT TGATGGTGTG GGGTGAGGAG GCCAAAATTA AGGGTTGAGA GCCTATTATT 1200

TTAGTTATTA CAAGAAATGA TGGTGTCATG AATTAAGGTA GACATAGGGG AGTGCTGATG 1260

AGGAGCTGTG AATGGATTTT AGAAACACTT GAGAGAATCA ATAGGACATG ATTTAGGGTT 1320

GGATTTGGAA AGGAGAAGAA AGTAGAAAAG ATGATGCCTA CATTTTTCAC TTAGGCAATT 1380

TGTACCATTC AGTGAAATAG GGAACACAGG AGGAAGAGCA GGTTTTGGTG TATACAAAGA 1440

GGAGGATGGA TGAGGCATTT CGTTTTGGAT CTGAGATGTC TGTGGAACGT CCTAGTGGAG 1500

ATGTCCAGAA ACTCTTCTAC ATGTGGTTCT GAGTTCAGGA CACAGATTTG GGCTGGAGAT 1560

AGAGATATTG TAGGCTTATA CATAGAAATG GCATTTGAAT CTATAGAGAT AAAAAGACAC 1620

ATCAGAGGAA ATGTGTAAAG TGAGAGAGGA AAAGCCAAGT ACTGTGCTGG GGGGAATACC 1680

TACATTTAAA GGATGCAGTA GAAAGAAGGT AATAAACAAC AGAGAGCAGA CTAACCAAAA 1740

GGGGAGAAGA AAAACCAAGA GAATTCCACC GACTCCCAGG AGAGCATTTC AAGATTGAGG 1800

GGATAGGTGT TGTGTTGAAT TTTGCAGCCT TGAGAATCAA GGGCCAGAAC ACAGCTTTTA 1860

GATTTAGGAA CAAGGAGTTT GGTGATCTCA GTGAAAGCAG CTTGATGGTG AAATGGAGGC 1920

AGAGGCAGAT TGCAATGAGT GAAACAGTGA ATGGGAAGTG AAGAAATGAT ACAGATAATT 1980

CTTGCTAAAA GCTTGGCTGT TAAAAGGAGG AGAGAAACAA GACTAGCTGC AAAGTGAGAT 2040

TGGGTTGATG GAGCAGTTTT AAATCTCAAA ATAAAGAGCT TTGTGCTTTT TTGATTATGA 2100

AAATAATGTG TTAATTGTAA CTAATTGAGG CAATGAAAAA AGATAATAAT ATGAAAGATA 2160

AAAATATAAA AACCACCCAG AAATAATGAT AGCTACCATT TTGATACAAT ATTTCTACAC 2220

TCCTTTCTAT GTATATATAC AGACACAGAA ATGCTTATAT TTTTATTAAA AGGGATTGTA 2280

CTATACCTAA GGTGCTTTTT CTAGTTAGTG ATATATATGG AGATCTCTCC ATGGCAACGA 2340

GTAATTGCAG TTATATTAAG TTCATGATAT TTCACAATAA GGGCATATCT TTGCCCTTTT 2400

TATTTAATCA ATTCTTAATT GGTGAATGTT TGTTTCCAGT TTGTTGTTGT TATTAACAAT 2460

GTTCCCATAA GCATTCCTGT ACACCAATGT TCACACATTT GTCTGATTTT TTCTTCAGGA 2520

TAAAACCCAG GAGGTAGAAT TGCTGGGTTG ATAGAAGAGA AAGGATGATT GCCAAATTAA 2580



AGCTTCAGTA GAGGGTACAT G 
CACTTTCTCA TTTCCCCTTG G 



CGAGCACA AA T GG GAT C A GGCCTAGATA CCAGAAATGG 2640 
G AC AAAAG G G .A GAG AG G C A ATAACTGTGC TGCCAGAGTT 2700 



AAATTTGTAC GTGGAGTAGC AGGAAATCAT TTGCTGAAAA TGAAAACAGA GATGATGTTG 2760

TAGAGGTCCT GAAGAGAGCA AAGAAAATTT GAAATTGCGG CTATCAGCTA TGGAAGAGAG 2820

TGCTGAAGTG GAAAACAAAA GAAGTATTGA CAATTGGTAT GGTTGTAATG GCACCGATTT 2880

GAACGCTTGT GCCATTGTTC ACCAGGAGCA CTCAGCAGCC AAGTTTGGAG TTTTGTAGCA 2940

GAAAGACAAA TAAGTTAGGG ATTTAATATC CTGGGCAAAT GGTAGACAAA ATGAACTCTG 3000

AGATCCAGGT GCACAGGGAA GGAAGGGAAG ACGGGAAGAG GTTAGATAGG AAATACAAGA 3060

GTCAGGAGAC TGGAAGATGT TGTGATATTT AAGAACACAT AGAGTTGGAG TAAAAGTGTA 3120

AGAAAACTAG AAGGGTAAGA GACCGGTCAG AAAGTAGGCT ATTTGAAGTT AACACTTCAG 3180

AGGCAGAGTA GTTCTGAATG GTAACAAGAA ATTGAGTGTG CCTTTGAGAG TAGGTTAAAA 3240

AAGAATAGGC AACTTTATTG TAGCTACTTC TGGAACAGAA GATTGTCATT AATAGTTTTA 3300

GAAAACTAAA ATATATAGCA TACTTATTTG TCAATTAACA AAGAAACTAT GTATTTTTAA 3360

ATGAGATTTA ATGTTTATTG TAG 3383

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11464 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: 5' UTR

(B) LOCATION: 1..3

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 4..82

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 83..1453

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 1454..1465

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 1466..4848

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 4849..4865

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: mat peptide

(B) LOCATION: 4866..4983

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 4984..6317

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: mat peptide

(B) LOCATION: 6318..6451

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 6452..11224

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: mat peptide

(B) LOCATION: 11225..11443

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: 3' UTR

(B) LOCATION: 11444..11464

(C) IDENTIFICATION METHODS: E



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

AAG ATG GCT GCT GAA CCA GTA GAA GAC AAT TGC ATC AAC TTT GTG GCA 48
Met Ala Ala Glu Pro Val Glu Asp Asn Cys Ile Asn Phe Val Ala
-35 -30 -25

ATG AAA TTT ATT GAC AAT ACG CTT TAC TTT ATA G GTAAGG CTAATGCCAT 98
Met Lys Phe Ile Asp Asn Thr Leu Tyr Phe Ile Ala
-20 -15 -10

AGAACAAATA CCAGGTTCAG ATAAATCTAT TCAATTAGAA AAGATGTTGT GAGGTGAACT 158

ATTAAGTGAC TCTTTGTGTC ACCAAATTTC ACTGTAATAT TAATGGCTCT TAAAAAAATA 218

GTGGACCTCT AGAAATTAAC CACAACATGT CCAAGGTCTC AGCACCTTGT CACACCACGT 278

GTCCTGGCAC TTTAATCAGC AGTAGCTCAC TCTCCAGTTG GCAGTAAGTG CACATCATGA 338

AAATCCCAGT TTTCATGGGA AAATCCCAGT TTTCATTGGA TTTCCATGGG AAAAATCCCA 398

GTACAAAACT GGGTGCATTC AGGAAATACA ATTTCCCAAA GCAAATTGGC AAATTATGTA 458

AGAGATTCTC TAAATTTAGA GTTCCGTGAA TTACACCATT TTATGTAAAT ATGTTTGACA 518

AGTAAAAATT GATTCTTTTT TTTTTTTTCT GTTGCCCAGG CTGGAGTGCA GTGGCACAAT 578

CTCTGCTCAC TGCAACCTCC ACCTCCTGGG TTCAAGCAAT TCTCCTGCCT CAGCCTTCTG 638

AGTAGCTGGG ACTACAGGTG CATCCCGCCA TGCCTGGCTA ATTTTTGGGT ATTTTTACTA 698

GAGACAGGGT TTTGGCATGT TGTCCAGGCT GGTCTTGGAC TCCTGATCTC AGATGATCCT 758

CCTGGCTCGG GCTCCCAAAG TGCTGGGATT ACAGGCATGA ACCACCACAC ATGGCCTAAA 818

AATTGATTCT TATGATTAAT CTCCTGTGAA CAATTTGGCT TCATTTGAAA GTTTGCCTTC 878

ATTTGAAACC TTCATTTAAA AGCCTGAGCA ACAAAGTGAG ACCCCATCTC TACAAAAAAC 938

TGCAAAATAT CCTGTGGACA CCTCGTACCT TCTGTGGAGG CTGAAGCAGG AGGATCACTT 998

GAGCCTAGGA ATTTGAGCCT GCAGTGAGCT ATGATCCCAC CCCTACACTC CAGCCTGCAT 1058

GACAGTAGAC CCTGACACAC ACACACAAAA AAAAACCTTC ATAAAAAATT ATTAGTTGAC 1118

TTTTCTTAGG TGACTTTCCG TTTAAGCAAT AAATTTAAAA GTAAAATCTC TAATTTTAGA 1178

AAATTTATTT TTAGTTACAT ATTGAAATTT TTAAACCCTA GGTTTAAGTT TTATGTCTAA 1238

ATTACCTGAG AACACACTAA GTCTGATAAG CTTCATTTTA TGGGCCTTTT GGATGATTAT 1298

ATAATATTCT GATGAAAGCC AAGACAGACC CTTAAACCAT AAAAATAGGA GTTCGAGAAA 1358

GAGGAGTAGC AAAAGTAAAA GCTAGAATGA GATTGAATTC TGAGTCGAAA TACAAAATTT 1418

TACATATTCT GTTTCTCTCT TTTTCCCCCT CTTAG CT GAA GAT GAT G GTAAA 1470

Ala Glu Asp Asp Glu
-10

GTAGAAATGA ATTTATTTTT CTTTGCAAAC TAAGTATCTG CTTGAGACAC ATCTATCTCA 1530

CCATTGTCAG CTGAGGAAAA AAAAAAATGG TTCTCATGCT ACCAATCTGC CTTCAAAGAA 1590

ATGTGGACTC AGTAGCACAG CTTTGGAATG AAGATGATCA TAAGAGATAC AAAGAAGAAC 1650
CTCTAGCAAA AGATGCTTCT CTATGCCTTA AAAAATTCTC CAGCTCTTAG AATCTACAAA 1710

A7AGAG777G CCTGTTTCAT T3GTCCTAAG AT TAG CATGA AG GCATGGAT TGTGTTGTAG 1770 

GGGGA GCGTT G G AT AG G AAA AAGGGATTGA AG GAT TAG. A. A TTGTCCAAAA TCAGTAACAC 18 3 0 

CTCCTCTCAG AAATGGTTTG G3AAGAAGGC TGGAAGGTTC CG3 3TTGGTG GTGGGGTG3G 18 90 

GCAGAAAA77 CTG3AAGTAG A 3 3 A 3ATAGG AA7GGG7GGG G 1 3 AA » 3 A A G A 3 CAGATTCAGA 19 50 

GG GCAAAAGC 7GAAA3AAAC CATG 3GATTT ATGATGAATT CAG3GTAATT CAGAA7GGAA 2 010 

GTAGA3TAGG A3TA3 3 AG AG TG 3TGAGA3 3 A 3 3TA3AGT 3 AT AAA G AG G 3 7G7AGAGGAA 2070 

GACGTTCTCT CACCCGAAGA T G7GAAA777 GG ACTTTATG TTG 3 AG AT A A TAGGGTTAAT 213 0 

7AAGCACAA7 ATGTATTA3 3 T A 3 G 3TAAAG ATTAGTTTGT TGTAACAAAG ACATCCAAAG 2190 

ATACAGTAGC TGAATAAGAT AGAGAATTTT TGTCTCAAAG AAA3TCTAAG TAGGGAGGTC 2 2 50 

AGAAGTAGTA TGGCTGGAAG CAACC7GA7G AT AT TG GGAG CCCCAACC77 CTTCAGTCTT 2 310 

GTAGCGATCA TCGGGTAGTT GTTGATCTCA CT2AGATAGT 7GAAAA7CAT GATAGTTGCT 2 370 

GGGTTGATAT CCGAGTTATG AA3AAAGGGT C.AAGAGAAGT CA3 3GTGATT CC777CAAAG 24 3 0 

A C T C T AA TTG GAAGTTAAAG ACA7CAA7CC CGGTCATATT CGATTGAGTA GAATTTAATC 2490 

AGATG 3GGAG ACGAAGTGGA AG3AAATCT3 GAAAATATAA TCTTTATTCC AGGTAGC 2AT 2 550 

ATGACTGTTT AAAATTGAGA AATAATATAT TTTTAAAATA 7CA77C7GGC TTTGGTATAA 2 610 

AGAATTGATG GTGTGGGGTG AGGAGGCGAA AATTAAGGGT TGAGAGCGTA TTATTTTAGT 26 70 

TATTAGAAGA AATGATGGT 3 TGATG.AATTA AG 3TAGACAT AGGGGAGTG 3 TGATGAG3AG 2 73 0 

CTGTGAATG 3 ATTTTA3AAA CAGTTGAGA 3 AATCAATAGG AGATGATTTA GGGTTGGATT 2 7 90 

TGG AAA 3G AG AA G AAA G T A. G AAAAGATGAT GGGTACATTT TTCAGTTAG 3 CAATTTGTAC 2 85 0 

CATTCA3TGA AATAGG 3AAG ACAGGAGGAA GAGGAGGTTT TGGTGTATAG AAAGAGGAGG 2 910 

ATGGATGAGG CATTTGGTTT TGGATCTGAG ATGTCTGTGG AAGGTCCTAG TGG AG AT GTC 2970 

C A G AAA 3 T C T TGTAGATGTG GTTCTGAGTT CAGGAGACAG ATTTGGGCTG GAGATAGAGA 3 03 0 

7A77G7AG 3 3 TTATAGATAG AAATGGCATT TGAATCTATA G AG AT AAAAA G ACACATCAG 3090 

A 3GAAATGTG TAAAGTGAGA G A G 1 3 AAAAG G CAAGTACTGT GGTGGGGGGA ATAGCTACAT 3150 

TTAAAG3ATG CAGTAGAAAG AAG3TAATAA ACAACAGAGA GGAGAGTAAG CAAAAGGGGA 3210 

GAAGAAAAA3 CAAGAGAATT CCAGCGAC7C CCAGGAGAGC ATTTGAAGAT TGAGGGGATA 3270 

G3TGTT3TGT TGAATTTTGG AG GG TTG AG A ATCAAGGG 3G AGAACACAGC TTTTAGATTT 3 33 0 

AGGAAG.AA3 3 AGTTTGGTGA 7C7CAG7GA_A AGGAGGTTGA TGGTGAAATG GAGGCAGAGG 3 3 90 

CAGA77GCAA 7GAG7GAAAC AG7GAA7GGG AAG7GAAGAA ATGATAGAGA 7AA77C77GC 34 5 0 

TAAAAGCTTG GGTGTTAAAA GGAG 3 AG AG A AACAA3ACTA GGTGG AAA GT GAGATTGGGT 3 510 

TGATGGAGGA GTTTTAAATG T C AAAAT AAA GAGGTTTGTG CTTTTTTGAT TATGAAAATA 3 57 0 

ATGTGTTAAT TGTAAGTAAT TG A GGGAAT G AAAAAAGATA ATAATATGAA AGATAAAAAT 363 0 

AT AAAAA GC A CGCAGAAATA ATGATAGGTA CCATTTTGAT ACAA7A777C TACACTCCTT 36 90 

TGTATGTATA TATACAGACA C AG AAATG C T TATATTTTTA TTAAAAGGGA TTGTAGTATA 3 750 

GGTAAG GT 3 3 TTTTTCTAGT T AG T GAT AT A TATGGAGATG TCTCCATGGG AACGAGTAAT 3 310 

TGGAGTTATA TTAAGTTCAT GATATTTGAG AATAAGGGCA TATGTTTG GG CTTTTTAT77 3370 

AATCAATTCT 7AA77GG7GA ATGTTTGTTT CCAGTTTGTT G77G77A77A ACAA7G77CC 3 93 0 

CATAAGCATT CGTGTAGA3G AA7G7TCACA CATTTGTCTG ATTTTTTCTT C A. G G AT A AAA 3 990 

CCCAG3A33T AGAATTGCTG GGTT3ATAGA AGAGAAAGGA TGATTGGGAA ATTAAAGGTT 4050 

CAGTA3A3G3 TACATGGG 3 A GCACAAATGG GATGAGGCCT AGA7ACCAGA AATGGCAGTT 4110 

TGTCATTTGG CC77GGGAGA AAAG 3GAGAG AG G C AA7 AAC 7G7GCTGCCA GAGTTAAATT 4170 

7GTACGT3 3A GTAG 2AGGAA ATCATTTGGT GAAAATGAAA ACAGAGATGA TGTTGTAGAG 4 23 0 

GTGGT 3. A A 3 A G A G G AAA 1 3 A_ A AATT TG AAA T TGCGGCTATC AGCTATGGAA GAGAG7G 27G 4290 

AA G 7 G 3 A.AAA C AAAAG AA GT ATT 3 AC AAT T GGTATGCTTG TA-ATGGCACC GATTTGAACG 4 35 0 

CTTGTGCCA7 TG77CACCA 3 C AG GAG 7 GAG CAGCCAAGTT TGGAGTTTTG TAGCAGA"J\G 4410 

AGAAATAAGC 7AGGGA777A ATA7 3CTG3C CAAA7GG7AG AC AAAA7G .AA CTGTGAGATC 4470 

GAG 37GCAGA GGGAAGGAAG GGAAGACGGG AAG AGG77AG A7AGGAAA7A C AAG AG 7 G AG 4 53 0 

GAGAC7GGAA GA7G7TGCGA 7 A 7 7 7 AA ' G .AA C A 3 ATA G AG T TGG- A 3TAAAA G T G T AAG AAA 4 5 90 

AC TAG AAG GG T AAG AG AG GG GTC A G AAA G T 7vGGCTAT77G AAG7TAACA3 77CAGAG3CA 4650 

G AG TAG T T ■' ' T GAA7G GT AA 3 AAGAAATT3A G7G7GGC777 GAGAGTAGGT TAAAAAACAA 4710 

7AGGCAAC77 TA7TG7AG GT ACT7C7GG.AA C 1 AG AAG A7 7 G 7CA77AA7AG 7777AGAAAA 47 7 0 

CTAAAATATA 7AGCATAC77 A.7T1 G7CAA7 7 AAC AAAG AA AC7A7G7A77 77TAAA7 GAG 4830 



A771 


AAATG 


3 7 TA77G7AG 


AA 


AAG 


G TG 


GAA 


7CA 


GAT 


7/AC 


777 


GGC 


/■jag 


CT7 


4880 










Glu 


Asri 
-5 


Leu 


Glu 


Ser 


Asp 


- V 1 
Phe


71 y 


Lys 


Leu 
GAA


TCT 


AAA 


77 A 7CA 


GTC 


ATA 


AGA 


AA7 


77G 


AAT 


GAG 


CAA 


GTT 


CTC 


77C 


4928 


Glu 


Ser 


Tys 


Leu Ser 


Val 


He 


Arc: 


Asn 


Leu 


Asn 


Asp 


Gin 


Val 


Leu 


Phe 
ATT


GAG 


CAA 


GGA AA7 


CGG 


CCT 


GTA 


777 


GAA 


GA7 


A7G 


ACT 


■GAT 


TCT 


GA3 


4976 


lie 


Arp 


Gin 


GLy Asn 
2 5 


Arg 


Pro 


Leu 


Phe 

30 


Glu 


Asp 


Met, 


Thr 


Asp 
35^ 


Ser 


Asp 




7G7 


AGA 




GTATTTTTT TTAATTGGGA AACA7A< 


3 AAA 


7GACTAGCTA 


CTTCTTCCCA 


5032 


Cys 


Arg 


Asp 
4 0 





























TTCTGTTT FA CTGCTTACAT TGT7CCG7GC TAGTCGCAAT C G T 1 3 AG AT G A AAAGTCACAG 5092 
GAGTGACAAT AATTTCACTT A C AG G AAA C 7 TTATAAGGCA TCCACG7TTT TTAGTTGGGG 5152 
T AAAAA^A iT j
ATCACCAAT3 
AATCCCTGCT 
GTAATCCTA j 
CTGTAGTACA 
AGACCTT3T3 
AAGTCTA3TG 
AGTGAATGT3 
TATTTAGCAT 
AAA G T T G C A G 
ATGCAGTTTC 
CTCACTACAA 
GTGG 3 ATT A 3 
GGGTTTCACG 
TCAGCCTCCC 
CAACTTTTAT 
CTGGGATTAC 
CCA3CTGTAC 
G3ACTAT2CT 
TTCTCATTTA 



G AT A C AAT AA 
CCTTTATTGT 
TGTTACAGCT 
CTACTTG 3GA 
CTGTGATCGT 
TCTAAAATTA 
TGCCTTCCAA. 
CATTCTTTAA 

TTAAAAGTTA 
CGTGTGTGTT 
ACTCTGT CAC 
CCTCCACGTC 
AGGCATGCAC 
ATGTTGGCCA 
AAA C AAA C A A. 
GAGTATTTTA 
AGGCATGAGC 
ATAGTCTCCT 
TTGG TT AAAT 
T TAT ATT TAT 



GAGATTG 3TA 
GATTG 3ATTA 
G AAAA TG GTG 
GGCT3 AAGCA 
A.CCTGT3AAT 

AA 3ATG AATT 
AAA. T A 3 T G AA 
AAAATAATCT 
GT AAT AC AC A 
CCAGG3TGAA 
CCACGTTCAA 
CACTTACA3C 
GGCT 3GTCTC 
ACAACCCCAC 
ATGATATAGA 
CACTGTGC 2 A 
GCAGA3TG3C 
TT 3CG 3 AAAT 
TTC/ 



GG 3GT 3 AT 3 3 CTCT :T 3AGC 
AC TGT TT AAA ACCTCTATAG 



ATA 
He 

TCT 
Ser 
65 
ATT 
He 



AGT 

Ser 

50 

GTG 

Val 



ATG 
Met 



TAT 
Tyr 

TGT 
Cvs 



AAA 
Lys 

GAG 
Glu 



GAT 
Asp 

AAA. 
Lys 
70 



AG ; 



ATT 
He 



AT 
Asp 
40~ 
CAG 
Gin 

TCA 
S^r 



ATAGTTTACC 
GGAG 3ATTG 3 
A GCCACT 3 3 A 
AAA AAA/JAA 3 
C 3 AAAT A T C A 
TACTTACCTT 
TTTAGAATTC 
TTAAACTGTG 
GTGCA3TG 3 A 
GCGATTCTTA 
CGGCTAATTT 
AAACCCCTAA 
AGTTTAATAT 
T T AT AAAA 3 G 
G3 3CTGAA2T 
CAAGT3TCAA 
GTTCCTGT 3C 
GCA CCC 
Ala Pro 



AAT 
Asr. 



AG 3TGT jGT'j 
T T 3 A 3 3 3 3 AG 
CTCCA 3 3CTG 

AA3TTA3 3CT 
AA C AT AT AT T 
AT AT 3TTT AA 
GG3TT3TTTG 
GT 3 3A3TGGT 
TG3CT3A3TC 
TT 3TATTTTT 
CCTCAAGTGA 
GTGTTACAAC 
TTGTTTTTAA 
GTGTTTTTAA 
AGTGGGAACA 
AA3AATTCTT 
CGG ACC AT. 
Ara Thr 


COl'i 
TCTG 3 3TGCC


C Q Q ") 

d y y - 


ACACATG2TG 


6052 


C T T T T AAA- T G 


6112 




6172 


GGTGTATTAA 


6232 


CTAACTAGAG 


6292 


^ TTT ATT 


6343 



He Phe lie 



CCT 
Pro 

ACT 
Thr 



AG A 
Ar q 



GGT 
Gly 



CTC 
Leu 



iii 

Z 1 p » 



TCC 
Ser 
ATATAATTAG 
CAAA.TATCCT 
AAAGTGAATA 
CTGAGCCTGT 
ACA.ATCAGTC 
T AA. < 3 A.T G T G A. 
ATTTCTACAC 
CTTAG 3CT.AA 
T AA TCT AAT T 
AG C CAG G TAT 
CTATTATTTT 
AGACTCAAG 3C 
TTCCATGTCA 
CCA.CC CA.GTC 
ACTTGGTAGG 
AA C AG ATA C A 
TTCAGGTGA3 
G C AT A AG h 3 j 
AA C AT AAT T A 
A. 3 G T G C AG A 3 
AG 1 3 G A. 3 j- \ 3 1 -j 

gggaaatcca 
aaagaatgtg 

G'-J jj'j'.j-^G _rA i 

ATCTCTA3TA 
CT3GG3A332 
AG AT- 33 TG 3C 
AAAAAAAAAG 

CAGCATCATA 
GAGAGTGAG 3 
CTAATAAG 3A 
CAGG 2CAG 3 3 
TCTCACTTGA 



GTAAG ACT' 



AA! 
Lys 
AAAT AT AA.C A 
CAGACCAACC 
CTTACTAAAA 
CACAGGGGAA 
TTTATACAAA. 
CTTTCCAGAA 
CTTTGT.AAAT 
GTCTTAGACA 
GAATAAAAGT 
AAAG TAT r i T C 
TCTCTATTTC 
AGTAAGA3TA 
TGAAiGACTCT 
CCCACTGAAA 

GCCCCCAGAC 
AATT TGG. AGT 
CCTGGGATG 3 
GAAG 3GAAG 3 
G 3 AG AT T C AG 
GTG AAA 
TATTT 



T3C 
Se r 
75 

fTTGTTTT 



Met 
6C 
TGT 
Cvs 



GCT 
Ala 

GAG 
Glu 



GTA ACT 
Val Thr 



ATC 

He 



AAC 

Asn 



AAA. 
Lys 



CAA7CA.TG1 



ATT 
He 

& 0 

AATATAACCA 



j. - 1 L l i-A. 

TTTTGTCTAG 
AT T AT C.AA_A C 
GAG3AGATAC 
T^-vA - ^l^-.T 1 ^ 1 
TC'A. 33 , 3"CTG 3 

CAAGCTTCAG 
T AT ■ 3 A.G.AT C A 
TGGCCTCTAC 
CTCCATTATT 
G 3 13 AG G 3 AT 3 
TTTTGA.GTG 3 
GAC A3TTAJ3 G 
CCACTCTAAA 
A.AATCCCTCA 
CCCCA.TTC TA 

.3' J J « Aj? 1 j ■■ J 

TAAGAAGATG 



ATA AG T AATG T AA TTAGAAAA.CT 
:AGAAATA A3AAG.AAGCA GAGAACCATT 



TCTTTACCTA 
AA.CACTTGTT 
G /--AT A. CAT AT 

T T 3' T A_AT C C C 



j. 1 



GCTGGG 3>3TG 
CACCT'3AAGT 
AAAA T A 3 AAA 
TG/iG 3 2 A :j jA 
ATTG3ACTCC 
TG AAAT T AAC 
TGGCTGGAAG 
GCAAATCTGC 
G GTG GAG TAG 
TCTTAG 3A3T 
ACAGTGGCTC 
GAT CAGG AGT 



A/3 



A ~: A 



C 

G (_ TG7A"--A'.t 
TTTTTCTCTA 
GTTA-3ATAAA 
CTTA3AAATT 
AGATT7G3CA 
ATAT3AC3TT 
AT AT AAT CCA 
GCTATCTGCC 
GA3CTGA3AG 
TGG.AA-AG 3 3T 
G :TC/^AG3TA 
AA3TCCG.^C 
lA33AAT 3C 



G GA 



.3 3 _ 1 



CTAAAAAAAA TAG AAAA ATT 
GTGAGC CTGA GGCAGAAGAA 
TCGCACCACT GCACTCCAG3 



ATT .--l 3 _ TG ,:3 
GAAT 2TTTTG 
AG TCTG 3G 3 A 
CAAAG3CATT 
AGATCTGTGT 
TTCTG3AA3G 
GACC A 3TTTT 
G 1 3 T T A T AAAA 
ATG3CTATAA 
T3AAGA3CAG 
AGCCAGGCAT 
TCGCTTGAAA 
CTGG 3 3GACA 



CCCGTAATGC 
A 3A':CA3 33T 
CGTGGTGGZA 
AAC3CG 3GA3 
ACAAGAGCAA 
AGGTTAATAA 
A.AA T G A G G 3 A 
A- A CTC AA T AA 
A. 3 3CCTTGTC 
GTG 3CCTAGG 
T 2CCAGCACT 
CCTGG 3CAGC 
G 3TGGCATGC 
C C AG 3 A.'jj- j 1 G 
G AATG A 3 ACT 



GATGGTTTTT 
CAT CTCCTG A 
CATTTAAGAA 
ATC3TT3TAT 
GTTGCTGATC 
TTTTTAATGT 
ATTATCTTCA 
TTATTATTCT 
CTAT.AG 3TAC 
CAGAGGAGAA 
GCTTTCATGC 
CCAAGGGGCA 
GTG 3ATATGC 

gtg 3ca3ccc 

gt tact 3. aac 
agg;jaaactc 
attcttgttg 

GTA3GAA 3TG 

GGAGG3C : 3A3 
GAGAAA3CCC 
ATCCCA3CTA 
CGATGAGCCT 
C AA.AAAA AAA 
TTTTT.AAGTA 
TAA3CTTCAT 
GAG3G3G3 3A 
TTTCCTG TCA 
TAA 3ATACAA 
AAGGCGAGTG 
TCTGTCTCTA 
CCAG 3TACTC 
TGAG 3TGAGA 



TTGTGATAAT 
TTATGACCTG 
GTGAGTTA3TA 

TAGTTGTTTT 
AT G T ATG T T A 
T.AATGCTATA 
TTATT'CTCCj-i. 
C C A C A A T T AA 
GGCAATGCTT 

AGTGAAGGTA 
AGT A-— t jjs-^j-^'— .-^ 
T C 3 AA 3 C A 3 A 
GCAG CTTA3T 
T.'-i-^G TaTG 3T 
TC-TG 3 3 AT .AG 
CTACA3GTGG 
TTGG 3ACTTA 
TTGAT 3G3CC 
CAG 3A 3TTTG 

TATG T3TGTA 
G GA'jA j'j 1 1 j 
AACTCGGTCT 
T * T T AA T A C T : 3 
A.T' 3 TGAG ATT 
ATATT AGTTG 
TTT AATG CCT 
T T ■ 3 T A G A T A„A 
TTGG TAGGGC 
ATG3CGATAC 
A3CTGTAATC 
TAG 3 TTGCAG 
T T G T C T C AAJ\ AAAA G AAAAA 



639: 



6439 



6495 

6556 
6616 
6676 
6735 
6795 
6S56 
6916 
6976 
7036 
7095 
7156 
7216 
7276 
7336 
7395 
7456 
7516 
7576 
7636 
7 6 9 6 
7756 
7 S 1 6 
7876 

7 93 6 
7996 

8 0 5 5 
8116 
8176 
8236 
8296 
8356 
8416 
8476 
8536 
8596 
8656 
8716 



GATACAACAG GCTACCCTTA TGTGCTCACC TTTCACTGTT 
TATAAA.GTTC TTTGGTCAAG AACCTTGACA A 2 ACTAAGAG 
CTGTCAGAGT CTGTTTCATA TAT AT AG AT A T.A3ATGTATA 



j i CL 



TGGCCAGGGT 
TCCCTG 3 ATT 
CCTTTCCCCT 
AAGGTCATTG 
GTG 3TCAACA 
ATGTGCAATA 
ACTGTAACTT 
TCTGTCGCCC 
GGGTTCACGC 



TO CCTCAGAC 
CAGATTCAAC 
TGGAGCACTC 
GGATTGCTTT 
TCAAAACTAG 
AGTGTGATTA 
TCTTTTTTTC 
AG G C TAG AGT 
CATTCTCCTG 



TTTCCA3TGC 
CCCTTCTGAT 
AAGTTTCACC 
CACAT CCATT 
GAAAGCTACT 
AAGAGATTGC 
TTTTTTTCTT 
GCAGTGG CAC 



A 3TTGG 3AGA 
GTAAAAAAAA 
A3 3TGGG 3CT 
TG3TATGTAC 
G C C C AA G G AT 
CTGTTCTAGC 
TTTTTCTTTT 
GATCTCAGCT 



G ATT ACT AG G TAT AAA/ 
GGATTTGCTT TGAGAG( 
TATGTATCTA TATCCAGGCT 
TGTTAGGTCA ATATCAACTT 
AAAAAAAAA 
TTCCAA3TT 1 
CTT3CCTAT' 



CCATGCCCAG CTAATTTTTT 
ATGGTCTCGA TCTCCTGAAC 



CCTCACCCTC CC AAGCAGCT 
GTATTTTTAG TAGAGACGGG 
TTGTGATCCG CCCGCCTCAG 



ACAGGCGTGA GCCATCGCAC CCGGCTCAAC TGTAACTTTC 

TGTAATGTTA CTAGAGCTTT TGAAGTTTTG G 3TATGGATT 

AGTTCCAAAT TGATGCCCAC 

GCAGAAGTGG GTGCCAATAG 

T G T T G AT AAA GAACAAAGGT 

TGGAGATAAC CCCGTGACCT 

AAATGCTATT TTAATTTTGG 



ATTTCA3ATT 
GTAGACAGCT 
GACCCACACT 
ATTGAG AA 3T 
AA G G A T G AA G 
GAATCTGTG2 
CATGGAA3 AA 
G C A C T AA C AG 
CAAGTAATCT 
CATTTG GCCT 
ACTAC T AT G 3 
AAT AC A 3 C A 3 
C A C T GT AA. G T 
TGTCTCTCTC 
CCTGGAATCC 
GGAC CAGCTT 
T AAAAAT TAG 
CAGGGG 3ATT 
ACTTCTG3CT 
AC T AG C C T AA 



GTCCTTACCT 
TATCCACACT 
TTTTTGAAAC 
CA2IGCAA3 3 
GG3ACTACAG 
GTTTCACCGT 
CCTCCCAAAG 
TATACTGGTT CATCTTCCCC 
ATTTCTCATT TATACATTAG 
CTCTTCCTAA ATTGTATATT 
TTATACTTTC AT C AA C T TAG 
GACTACTGAT TCCACAACTG 
GAGTCTTTCA GGCATCTTTG 



GGGGTTCTCC 
ATGG 3TGGGA 
CTATTCTGAA 
CTCGCTTTCA 
GGAGTCTCGC 
TCTGECT 3CC 
GCG 3CTGCCA 
GTTAGCCAGG 
TGCTGGGATT 



AG CTTAGGGT 
GG 3AACTAGT 
CAAGAGTTAT 
CTGCCATCCA 

A3 3TTTCTCT ATCAGTGCTT AGGATCATGG 



TGCCATGAGG 
CCTTAGGTGG 
GAAAAGTGCA 
AACCATTTCT 
TCTAAGGCAG 
AACTGGAGTG 
G C T T A C AC AG 
TTTAATTTCA GGCTCCA 



AGG CAA.CAAA GTGA3ATA.CC 
CCAAATGTGG 
GCTTGAGCCC 
GGGCAACAGA 
GTTTGTGGGA 



AGG AC A. 3 AAA TTGACATTAG 

T T G A G C AAAT GT3GACACCA 

G G G AAA J 1 . 3 1' A GAGGCCTG 3 A 

GA.GTTAAGAA GG AG AGG AGG 

AG AAG C AAAA CTACTTTTGT 

AAATTGTTCA TGTCCTGAAA 



C C AAAATT AA GTCCAAAACA TCTACTGGTT CCAGGATTAA 
TGCCCACATG TTCTGATCCA TCCTGCAAAA TAGACATGCT 
GGCAGCACTA CC AGTTGGAT AA.CCTGCAAG ATTATAGTTT 
CACAAGGCCC TATTCTGTGA C T G AAA. C AT A CAA3AATCTG 
G3CCCAGCCA A33AG.ACCAT A.TTCA.3GACA GAAATTCAAG 
CTTGGCAGGG AA3ACAGA3T CAAGGACTGC CAACTGAGCC 
GAACCCAjGGG CCTAGCCCTA CAACAATTAT TGGGTCTATT 
AAA GAG T AAG CTAAGATTCC TGGCACTTTC 
GCCAG3CATG GTGGCTTACA 
CACTTGA.GGC CAGGA.GTTCA 
TTCTCTACAA AAATAAATTT 
AGCTACTCAG GAGGCTGAGG 
AG CTATGATT TCACCA.CTGC 
AAAAA3AAAA AGAAACTAGA 
GCCGTGAATG GTTATTATAG 
TG 3TG3AACT CTA.CTTAATC 
3TGTT T GAT G TAT AG 
5 GAG AAT AGGAGATTCG 
7TTTTAT GTAACTCTTG 
\CAAACT TCATATATTC 



ACA/3TTG3CT CA.GA_.VAT 3 AG AACTGGTCAG 
CAGCACTTTG GGAG3CCG.AA GT3GGAGGGT 



CCCTGACCCC 
TGGT3TATAC TTACAGTCCC 
AGG AA.T T C AA G 3 3TGCAGTG 
G C GAG AC C C T G T C T C AAAG 3 
GGAG GTCATC ATCGTCTTTA 
C C C A AAAA G C TTGTGGTCTT 
CTCAATGGGA GAG GAGA 3AA GTAAG 
A C T G AAT AT G CATCCCATGA CAG' 
TCAGTACTGC T 3 T T CAG AG A TTT 
TCTGTTTGGT AA.T AT A C TT C AAA 



TAATTAG3TA ATGTTTTTTT CA 



AA l 

Asn 



CCT 


CCT 


GAT 


AAC 




AAG 


GAT 


AC A 


AAA 


AGT 


Pro 


Pro 


Asp 


Asn 


He 


Lys 


Asp 


Thr 


Lys 


Ser 
95






AGA 


AGT 


GTC 


CCA 


GGA 


CAT 


GAT 


AAT 


AAG 


A.TG 


Arg 


Ser 


Val 


Pro 


Gly 


His 


Asp 


Asn 




Met: 
110








TAG 


GAA 


GGA 


TAG 


T T T 


CTA 


GCT 


TGT 


GAA 


j-^j- \a 




g:u 


Gly 


C v r 


P h e 


Leu 


Ala 


Cys 


G ] u 


Lys 


120 










12 5 










CTC 


ATT 


TTG 


AAA 


A. A A 


GA:3 


GAT 


GAA 


TTG 


GGG 


Leu 


lie 


Leu 


Lys 


Lys 


Glu 


Asp 


Glu 


Leu 


G 1 y 










I io 










145 


ACT 


GTT 


CAA 


AAC 


GAA 


GAC 


TAG 


CTATTAA 


AATT 


Thr 


Vai 


Gin 


A. s n 


Gl u 


Asp 
















1 5 5 














(2) INFORMATION FOR SEQ ID NO: 14:





GAC 
Asp 

CAA. 
Gin 

GAG 
Glu 
130 
GAT 
Asd 



ATC 
He 

TTT 
P n e 

1 1 c; 



ATA 
3 le 
100 
GAA 
Glu 



J GAA ATG 
Glu Met 
85 

TTC TTT CAG 
Phe Phe Glu 



TCT TCA 
Ser Ser 



TCA 
Ser 



AAC CTT TTT AAA 
-.sp Leu Phe Lys 
135 

;CT ATA ATG TTC 
;er He Met Phe 
150 



8776 
833b 
8395 
8956 
9016 
9076 
9136 
9196 
9256 
9316 
9376 
9436 
9496 
9556 
9616 
9676 
9735 
9795 
9356 
9916 
9976 
10036 
10096 
10156 
10216 
10276 
10336 
10396 
10456 
10516 
10576 
10636 
10596 
10756 
10816 
10876 
10936 
10996 
11056 
11116 
11176 
11233 



11281 



11329 



11377 



11425 



11464 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28994 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: 5' UTR

(B) LOCATION: 1..15606

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 15607..15685

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 15686..17056

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 17057..17068

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 17069..20451

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 20452..20468

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: mat peptide

(B) LOCATION: 20469..20586

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 20587..21920

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: mat peptide

(B) LOCATION: 21921..22054

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 22055..26827

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: mat peptide

(B) LOCATION: 26828..27046

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: 3' UTR

(B) LOCATION: 27047..28994

(C) IDENTIFICATION METHODS: E



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 



ACTTGCCTTA AAA30TTTGC ATAG3TAGAC AACATTAGAT TAATTTCCTT GCTCACATCT 60

GTTCAAGAAA AATCATTTAA GTTATAAJA.^ i ATAACAAACC TTCTGCATTA TAAGACTGAT 120

GTTTAG7AAAT ATAAACATTT TATACATCA2 CATTTAAATC TTTCTCCAAG GCTTCATCTT 180

TATAAAATAG TCCG3AAATT TCAGA3AAAG ATGAATCTGA TTTTCCAAGA GAGGACAGCT 240

GTGGACTATC TGGCACTGGA GACTAAATAA AGAAAGCAGG TACAGTCAAT AAGATCTTCA 300

GGACATATAC ATTTTGTTTA TTAAGAAAAA GCAAATAAAA CATTTTTCAG AAAAAGGCAA 360

ACATGCTAGA AAGCATATGA CTTAGTCATT TGAGTTTTTA TTATTAAGGA AATTTACAGG 420

CCCAAGAAAC ACCTTGOTCA ATATATTAAA TTTTATTTTG GTTTTCAACT AGACTTTGCT 480

TTTCATTTGT TTGTTTTT3T GA2AA3TTCT CGCTCTGTCA CCTAGGCCAA AGTGTAGTGA 540

CACAATCTTA GCTCACTGTA G3CTCCTAGA TTCAAGTGAT CCTCCTGTCT CAGACTCCTG 600

AGTAGCTAG3 ACTACAGGAA CATTOCACCA TGCCCAGCTA ATTTTGTTTT GTTTTGTTTT 660

GTTTTCAGA3 ACAATGTATT G2AG2GTTGC CCAGGCTGAT CTGAAACTCT TAGCCTCAAA 720

CGATACTCCT G3CTCAGCCT CCCAAAGCAC TAGGATTACA GACATGAGCC AATGCGCCCA 780

GCCTTAAATT AGACTTTAAA TGTGGTTTTA AACTCCTGTT GAAAAAGCGT CTGGTATCTT 840

GAACCAGTAG ATGTTTTCAT A30AATGAA3 CTAAACTGTA ATTTAGACAG TAGCCAAATG 900

CTTGTGAAAT TTTGOTAAAT AATATAATCT TCAAGGGAGC AAATCATGTC CCAAATGCAA 960

AAGATCAA2T G'j 1 -j j ! j'j : jCA GTAGrAAAA3 ACAGGATACT GTGCTCTTTA AAAGGTCAGT 1020

AACTATAGTA CCTA3TTATC TTACTTATCA CAGCAAAATA ATTACATAAA ATCCTATGGA 1080

TCATAAAGG I ACAGACTCAC TTCTGTCTCT AGATCTCAAG CTACCAAAAA GAAATCTCCC 1140

AATAGTTTCT TGGAGG2CTA TACTTAGTGA AAAAGCAGCT GGAATCAACA TAGTTCCTCC 1200

TATGTTGTAG GACAATCCTA GCTCTGG3CA TACGAATACA TTAAATCCCA CTTATCTATA 1260




GAGCTTTCTT AAAGGGAAGA AATTTGAGTA GTATGTAAAA CAGAATAAAA GATTAAGGCT 1320

CCATAGGCAT ACAGCTTACC TCCAATTCTC TTGGCCTCTT GCAATTTCTA TTATCAGGCT 1380

TTACAAGGTG ATTTGCCATC ATATTCCGAA GGCACCAGCT ACAAAGCTTA GAACAAT3CC 14 4 0 

AG ATT TAG GT ACAAA3TCCA TGCTACAAGC TCTCTGGAAT CCTTCCCTGT TTCCCACTCC 150 0 

TACTGCTGAT GTTAATTTAG ACTGT 3ATT.A TCTGTCACTT TCCTAAACTC AATTTCT3CC 156 0 

TCCTCTAAAT CATTCTATCA ACTGCTATTT GGGTAAT CTT TCAAAA2TTT G ATT ACT G 2 A 16 2 0 

TTCCTTTAAC TCAAAAACTT TCATTGTTCG AGAATAAGTT G AAAT T 3 C A T GATATGG 2CT 16 8 0 

TCAAGGTCCT GTATTATCTG GTG 2 AAG 3CT AZTA3TCCCA TCATTTTCAA CTA2TCCTCT 174 0 

CTATGTACTT AGCCAAATGA GTCTCTCTGG CAATTCTGCC TTGTTTGAGG ACTGGCTCAG 18 0 0 

TTAAGATTCT TTTATCTTCG GCCGG 3CGCG CTGGCTCACG GCTGTAATCC CAG 2ACTTTG 186 0 

GGAAGCTGAG G CAG G AAG AT CA2CTGAGGT CGG 3AGTTCG AGACCAGCCT GG C C AGC AT G 192 0 

GTGAAACCCT GTGTCTACTA AAAAT CC AAA C AT T AG C CAG GZGTGGTG3 3 AGGCGCCTGT 193 0 

AATCCCAGCT ACTTGGGAAG CTGAG3TGA3 A 3 AA T C G C T T GAACCCA3GA GAG3GA3GTT 2 04 0 

GCAGTGAGCC GA3ATTGTGC CATT 3 3 A CTC CAG2CTGG3 3 AACAGA3 3GA GACTCCACCT 2100 

CAAAAAAAAA AA33.ATTCTT CTATCTTCAC AAAATCTTAA TGTTTAAACA GGT CTTACAG 2160 

TTCATCTAAT TCAATCTCAT TTTTTACAA3 TG.A3AAAA3A G3 3ACAGT3A CGGT3GATCA 2 22 0 

A 3TGACACC A GT AA 3 A CTG A G3TAAATCA3 AACCGAGATC TCACTCGAGT CTGA3 3TTAT 2280 

TCCCACTGTC CAACCTTACT TT AAA 3T AG 2 TTCAAATTTT ACTTTTACTT TTCCATAAAT 2 34 0 

T CGG. AAG GG A TTTTCCCTAG GAGT 2CAAAT G T T G AAAC C T G G AAG G G T AT A3TCTCTGTG 24 00 

TCTTTGAGAT GAGGGGAGCC CTGT 2CATAT T C AAG T T AT C AATTGACTTT GTTGTTTTTG 246 0 

AG AAACG AT G CTGATTTGGG TAACTTTAA3 ACATCTGTTT GATTAGTCCT A T AAAAT AT G 2 52 0 

CATATATAGA AG AC AG AAAG A 3 C AA C AAC A AA T T T G AAAG ATGCTTGTTA A3TAAATTCT 2 53 0 

GTATCGTACG TGTCCATTCC TG2CAGTACC TTTATAGTAT GTAAGTTTAC GTGCTGTAAT 2 64 0 

AG T A T T AA T A GTATCTA3.AA AATA2TACAC ATG 2 AC AG 2 A GTGCTAACTT TGCCTTG 3GA 270 0 

G T T G G AAAA T ACTTCA3AGA A3CCAACA33 CA3ATTTTTC TCTCTTCCCT TCCCCTT 3TA 2760 

ATTTTCCCTT TCCCCTTCAC CCCCTTCTCT TCT 3TCCCCA AGTAACACTG TGCACCTATG 2 82 0 

T C AAA C G AAA AC T T AT PA T C AA 3 T AA 2 T G T TTCTGCAAAA ATAAGTTCGT TTTCCTGTCA 2880 

TGGCTCAAGG CCTCA3CAGA TCCAGG 3CTG GTG 3ACGGGC TG3TCTTCGT CGTGTGCCAA 2940 

ACACTGACCA CTGCCCTGGC TCTG2CATCT TAG3CTTAGT GACCTGGCTG TTACTAAGCA 3 000 

CTGTCCCCTC TGCCCCATG Z A3 CT3TCTCC TT3TAGTCTT CTCCCTCTTC TCAACGCGAT 3060 

CCTA3CCCCT CAG3 2CATTT CACCTCCATT TTCCCTCACT TCCCGCCG 2C CCTCCGCACT 312 0 

TCCTTCCTAC TGTTGTTTCC G2CCCACTAG A 3 2CCCTCA3 AGAAAGTTTC CATCCTCGCA 3 180 

CCCTTCCTT 3 TGTCACAGCC CGTCA2ATT2 TCA3AG3 23 3 CCATCCCTCC A 3CCCCACCC 3 24 0 

C.AA33CCAAT GTACTTCGCG GTAT333GAC CTT CCTCGT 3 A33GAACGC3 A 33 3AGTGAA 3300 

GACCCTG3G3 GCG3 3GT3CT CGGA.3TTC 3 3 GG3T3 3AG3T G 3 3AAGC 3CG CCGCACTCCC 3 360 

AG CA3CCCCT GCAC3AGTCA CGTGACAGCT CTC 2CACCAC CA2CCCCCCC AACTTCCCCA 3420 

CCGTAG 3CTC CCAGAGCCAG GCCCCACGGA AA3 3CAGCTT TTTCCCGGTT TTCTCCCGCT 34 3 0 

CTTTCCCCTC CACTTGGAAT A CTC CTG AAA CAAAAATCTC TCCCTGCCA 2 CCTGTGTGTG 3 540 

TTTGAACCAG G AAA AAA TCT GAAACTGGTC AA. 3 AAAG AA C AA. 3 3 AAG ACT T G C C AAA. 3 C A 3600 

AG3CCG3TGT GTGTCCCAGC AGCTTA3AAT CT3A3 2AAA3 G.AACACAAAA TAG3ACAT2C 3 66 0 

A2G3CCTCTT TTCGAGTAAA ATTTA2TTG3 TTT 3TTTGCA GGAAGGGTTT AAAACTGC3T 3 72 0 

TTGCAGATG2 TCTGTTTGCA G 3AA 3 3 3TTT AATCACGTGT TCCCCTG 3 3 2 C.A2AAGCAAG 3 78 0 

GCTTTT A3AT C CAG. AG 3 CTC AGTTA3TGCC CCCTCTT 3CT CTTTGGIG2A ACCAAACGTT 3840 

CAG.AATCACG CCTTCTTAGA AAATT 2TTAC CTCG 3GTCTG TTAATAA3TT AAGTCTAATT 3 90 0 

GGC.AACAG3T ATCAAAAAGT GTTG 3 AT. AAC AC AC ATG 3CT CACATAATTG TAGCTTTGCC 3 96 0 

TCATCGGGTG TTTTAATGCG GA3G 3TTTGA CCTG 2AATTT C AAAG AT AT A CATTCCAA3 2 4 02 0 

TTACGCCCA 3 TTAGTGGATG T3GAA3AAAA AAAAAAG C AA ATTACCTCAT AACACAAA3G 4 08 0 

TCAATAA3AC ACATCCATAA G3TCCA3GTA CAAAATCTTA CATCTTAGAG AACTATATTT 4 14 0 

AACATTTACA TACATTACTA AGGT TTTTTT TTTC2TTTT3 CTT3ATTAAA T3TTA3TTAT 4200 

CATTAA3TCT TGGAATTATT CTGT3T3TGT ATATTTATTT G3T3TTTGTG A A 3AAGCC3 3 4260 

TTGTTTT AAA TAA3TT 3CTA G AAAAT AAG Z G 37CAAT 3T 3 TT FAATCTGA GTTGCTAATA 4 32 0 

TT3TGAAATA TAGG 3 C.AC AT AATA3TA3 3C TA3ATAACTA T 3 3CGAAGTA AG3AGTCTCA 4380

AA.CACTGTCC CAGAACAATA GCAATCTGTG TTGAATTTTT AC 2CTCTGT 3 G T AAAA T G A A 4440 

G3GAAAA3 3A ATGAAGTTTT AG TTT 3 2 CTT AATTTTTATC TTTATTGTTT CAGACTCTTC 4 50 0 

AGCAGTATAA AGTTTTCATC AAGTCAAATA TATTCACTTT AAA 3 T '3 A CTG TG 3TTTATTC 4 56 0 

TGATACCATG TCCTTCCTAA TTTGG 3 3 3GC C.AC 3TGAGAT AAGTTTTATG AAATAAAAAG 4 62 0 
ATTAAAAATT CTTACATTTT TAGTC-TCCTT CC7TGGTAAA AT3TAGAGTT GTCCACTGTG 468 0 
TT TATCTCCT CCTCCTTATT ATCAT Z 3TTG CT 3TTATTAT TTTTAATG 3T TCATTAAACC 4 74 0 
CAAG3 3TCT3 G 3 AAATA.CT Z AT3CAATTCA TCT3ACAGCC TT2ACACTGT A T G AT AT T T A 4 8 00 
AAC.AG3TG3T TGTCCATCT 3 ATT C 7 T AAAA TATTTCCAAG AAAAATGATT C ! 2ACCTAATG 4 86 0 
CAT.AAATGCT TTCATC AGAT TAAGA3AA3A C CAT GG AC AT TT T ATTTTAT TTTATTTTTT 4920 
AAATATTAAC TTCCATTGCA T AAG 3 T AAAT GG 3TAGGAAT AAGTGAG ATG ATATTGTTAT 4 98 0 
CTAGAGCTTT AAAAT ATT C A AAGGG 2TGTC ATCATTATCT CATT T AA TCT TTGAAAACAA 5040 
CTCTATGAAG T AC AAACG AC A C T G AG A C A T TTCTTGCTCT A T A T C AAA 1 3 A AAAAAGTGTT 510 0 
TGTCCCAAAA C T T C AAAAT G T G T AAA T T A C ACATTCTGCA TCTTTACAGC TGG AG AAAAT 5160 
TCACTGGCAA TGGAATATTT AAAATTAGAG CTTG CTTAGT GTGCTG CTTC TGATCACTAC 522 0 
TTGATCCCAC TTCGTGCTTT CATGTTAATT G 3 CC CAATTG GACTCTACAG TTGGAAGGTG 52 8 0 
AAAACTTACT ATTTCAACTT GA 3T C AC 3TA TGTAT i CT i A T 3 AT AT AC A T C T T AAA. 3 3 T A 5 3 4 0

CTATTTTTTT TGTTCTGATA GTCA3CA3AC CAAGCA3TTC CA3CCACCCT GCCA 3 AG ACT 5400 

TCCTTTGTAA TCACTGTTGA AGGACAT 3AT GTTTTTATGA CTTCCCGAAA T 3AAAACCCT 54 6 0 

ATCTTGTTTT TAAAACAAAC AA_-.CCAA.CAA AAAGTA3C37 T7A7GTAAGC ATTTT 3TTCC 5520 

CTGACTCTAG GAACCCCTCT GTTTTTATAT CAACTCTG I" A CT3 3CAAAAC ACAAAAACAA 5580 

AATGCCA3CT TGCTAATTCC CTTCCTAG3A AAGTAATA.CA GT7TAGCACA TGTT CAAGAA 564 0 

AAAAATGGCT AAGAAATTTT GTTT3CACTA ATTATTTTCA A3ACTGTGAT ATTTACACTC 57 0 0 

TGCTCTTCAA ACGTTA 3ATT T T AT AAG ACT ATTTTTTAAC AT3TTGAA3A TAAGCCC7AA 5760 

ATATATGTAT CCTTAAATTG TATTTCAAAT ATTTT AGGTC AGTCTTTGCT ATCATTCCAG 582 0 

GAATAGAAAG TTTTAACA3T GGAAAC7GCA A G T AAA T AT T TGCCCTCTTA CCTGAATTTT 58 8 0 

GGTAG CCCTC TCCCCAAG 3T TA3TTTCTGT TGCAGAAAGT G T A AAAA T T A TTACATAAAA 5 94 0 

TTCTAATGAT GGTATCCGTG TG 3CTTG CAT CTGATACAGC AGATAAAGAA GTTTTATGAA 600 0 

AATGGACTCC TGTTCCACT 3 AAAAGTAAAT CTTAATG 3 CC TC-TATCAACT ATCCTTTGAC 6 06 0 

ACCATATTGA GCTTGGGAGG AAG3GGAAGT C C T G AA T G A ' 3 GT7ATAAA3T AAAA G AAAA7 612 0 

AT T TG C AAAA TGTTCCTTTT TTTAAAATGT TAG ATTTT AG AAA.TATTTTA AGTGTTGTAA 618 0 

CATTGTA3GA A7TACCCCAA TA3 3ACTGAT TATTCCG CAT TG7AAAA7AA G AAAAA.G T T T 624 0 

TGTGCTGAAG TGTGACCAGG AAG T C T G AAA ATGAAGAGAG AC A 3ATGACA AAAGAAGATG 63 0 0 

CTTCTAATGG ACTAAGGAGG TGCTTTCTTA AA3TCA3AAA GAGATA CTCA G AAA G A 3 G T A 636 0 

CA3GTTTT33 AAGGCACA3A GCCCCAA3TT TTACGGAA3A AAA3ATTTCA C G AAAA. T A G T 6420 

GATATTACAT T AAAAG AA 3 T ACTCGTATCC TCTG CCACTT TATTTCGACT TCCATTGCCC 64 8 0 

TAGGAAAGAG CCTGTTTGAA GGCGGGCCCA A3GA3TGCCG ACAGCA3TCT CCTCCCTCCA 6540 

CCTTCTTCCT CATTCTCTCC CCAGCTT GCT GA33CCTTTG CTCCCCTGGC GACTG CCTGG 6600 

ACAGTCAGCA AGGAATTGTC TCCCAGTGCA TTTTG CCCTC CTGGCTGCCA ACTCTG3CTG 66 6 0 

CTAAAGCGGC TGCCACCTGC TGCA3TCTAC ACA3CTTCG3 GAAGAG3AAA GGAACCTCAG 672 0 

AC 2TTCCAGA TCGCTTCCTC TC3CAACAAA CTATTTGTCG CAGGTAAGAA ATATCATTCC 6780

TCTTTATTTG GAAAGTCAGC CATGGCAATT AG A 3 3 7 AAA 7 AAGC7A.GAAA G C AA 7 T G A.G A 6840 

G G AA T AT AAA C C AT C T A G C A T C AC T A. C G AT G AG C A 3 7 C A G TA.7 C AA C AT A AG AAAT AT AA 6900

GCAAAGTCA3 AGTAGAAT7T TTTTCT77TA TCA3ATAT33 GA.GAGTA7CA C777A.GAGGA 6960

GAGGTTCTCA AACTTTTTGC TCTCATGTTC CCTTTACACT AAGCACA7CA CATGTTA3CA 702 0 

TAAGTAA CAT T T T T AA T 7 AA AAATAAC7A7 GTACTTTTTT AA C AA.C AAAA AAAA3CATAA 7080

AGAGTGACA3 TTTTTTATTT 77ACAA37G7 TTTAA. CTG 37 77 AA7 A 3 AAG CCATA.7AGAT 714 0 

CTGCTGGATT CTCATCTGCT TTGCATTCAG AC7AC7GCAA TATTGCACAG AATG3AGCCT 72 0 0 

CTG 3TAAACT CTGTTGTACA C7CATGA3AG AA7 3 3373AA AAAGACAAA7 7ACG7CT7AG 726 0 

AA77AT7AGA AATAGC777C AC7TTA.3 3AA C7C 3C7GAGA. A77GCTGCT7 7A.GA37G3TA 7320 

AG AT AAAT AA GCTTCTCT7T AAACGGAATC TCAA3ACAGA A7CAGTTACA TTAAAAGCAA 7380 

A. C AAAAAA T T TGCCCATG3T TAGTCA7 377 GTGAAA7C73 CCA CACC777 GGAC7G33C7 7440 

A. C AAT T ■ 3 3 A T AATATA.GCAT TCCCCGA3AT AAT777CTC7 CA.CAATTAA 3 GAAAGGG3TG 7500 

AATAAATATC TCTGTTTGAA GTTGAATAAC AAAAATTAGG ACCCCCTAAA TTTTA3G 3CT 7560 

CCTGAAATTC GTCTTTTTGC C7ATAT7CA3 CTA C777ACG T73TATTAAA 7C7TC777CA 7620 

G33CA.G373C A.CTA3CTCAT GCCTA3AA7C 7CA3 3 CA3 3 3 C73AGCCCA3 GAAT77GAGA 7680

CCAGCCAGGG CAACACA.GTC TCTACAAAAA AATAAAAAAT 7ACCTGGGTG TGTTGGTGCA 7740 

T 3 2CTGT AG A AC T AC 1' C A G G AT G C 7 1 3 A 3 G A CTG 3TT 3 A G C C C A 3 G AT AG 3 CAAA7 C7G7G 7 8 0 0 

G7 3AG7TCA3 CCACTAAACA GAGCGA.GACT T T C T C ."AAA. A 7 A 3 AAA. C AAA AAAA C A AA C A 786 0 

AACTTCCTTC AAAATAACTT T77ATC7GCA AT37777CC7 ATTGCCTGTG AGATTAAATT 7 92 0 

TACTCT777A CCTGATTTCC AAAG ZCC7C 3 A.7AA7 CTAAT CCGAC7TTAC C77G7G77CA 7980

C 7 G C AAA. A 7 A G 3AGGAC7G7 TCCAC7ACAA 7CCAAAAA7C ACA3 377G 3 3 TGCAGTGGCT 8 04 0 

CAC7CC7G7A ATCCCAACAC TTTGGAAGGC CAA3 3CAGGT GGA.T7GC7TC A3CTCAGGAG 810 0 

TTCAAGACCA GCCTGGGCAA CA.TGGCAAAA ACCC7 3TCTC TC 2 AAAA CAT ACAAAAATTA 8160 

G C C AG A7G 7 3 GT AG T ATG TG C CTGTA 37 C C C AA C 7 AC 7 1 A AAA 3 3 C T AA 3 G 3 AA 3 AG GAT 8 2 2 0 

CACTTGAGCC CAGGAGGTCA AGGCTACAGT GA.G CTATGTT 7 A C 7G T G T C A. CTGCACTCCA 82 8 0 

GCCT3 3GTGA TAGAGCAA 3A CCATGT37 3A AAAAA7AAAA AAA 3 AAA AG A AAAGAAAAAA 8 34 0 

AlATCGCTCr ATTCA.GTTCA CCCCCA3CA3 AACA.T7 3777 7GATTATOAC A 7 AAAT G CTG 84 00 

GTCCATTG3C TTCTCTATCT ATTCAAATCT TTAAGCAT7C T77GAGATTC AACTCAATTC 84 6 0 

T 3CTTTTCAA AC TAG- 3 CC AT T 7 AAA. C 7 A 3 A 7CAG77CCA7 777 CAT 7 77 C 7TG 37TTGA3 8 520 

TCTA3A3ACT C AAAAAC AAA AACTTAAAAA CTTAT 773 'i T A A 3 7 TT TCTG CTACTCTCAC 8580 

TTCTTCAACA CTCACATACA C GCATT 3 AT A ATAA 3ATGG 3 A ■ 3 AA TGTT 1 C A AGGATAAAAT 8640 

GA7TTA7AGA ACT G AAAA 3 T 7 A. 3GTT77G A 7377377337 GT CAAGA7GA CTACCTACCT 8700 

GAT CTCAGGT AAT T AAT 1" AT GT AG 3A7G 37 CCCTCA7TTC ATCCCATACC T ATT C AA 3 AG 8760 

GATTGGAATT CCACAGCAAG G A T AAA ■ 3 A. T A AT 3 ATA. 373' 3 CTTTTCAAGT TCAAGGCATT 8 82 0 

TTAACTTT'I A ATCTAGTA 3T ATGTTT 3TT 3 TTG7TGTT37 TG TTTGA 3AT G 3 AG 3C-3TG 3 8880

TGTGTCACCC AGGCTGGAGT GCAGTG3 3AC GAA.CTCG 3 37 CACTGCA.ACC TCTGCCTCAT 8940 

GGGTTCAATC AGTTATTCTG CCTCAGTGTC CCAA CTA' 3 37 GG 3ACTACAA GGCACATGCC 9000 

ACCATGCCTG GCTAATTTTT GTATTTTTA 3 TA3AAACA.G3 GCTT C AC CAT G TTG 3CCAGG 9060 

CTGGTCTCGA ACTCCTGACC TCAAGTGATC CA3CCG CCTC GG 3CTCCCAA A 3 TG CTG GG A 9120 

TTACAGGCAT AAGCCACCGT GCCCAG 3CTA ATAGTATGTT TTTAAACTCT TAGTGGCTTA 9180 

ACAATGCTGG TTGTATAATA AAT ATG 3 CAT AAATATTTAC TGTCTTAGAA TTATGAA.GAA 9240 

GTGGTTACTA GGCCGTTTGC CACATATCAA TGGT7CTCTC CTTACAG CTT TAATTAGAGT 9300 



CTAGAATTG 3 
GTCCAAATGA 
CTCATGTGTC 
AAAGTATTG 3 
C CTG 3 j A 1 -i j j 
AG AATTTTAG 
TA3A3TTAAA 
TGGG 3ATGA 3 
ATTGTAA 3 AA 
A 3GG.-AA.-_AT 
TTCCCTCCTG 
AAAAA T AT A T 
ATA3 3AAACT 
AGGTGTTG AA 
GAACAGACCT 
GCATGAACA. 3 
GA33.A3A3AA 
GCA3 3 3TTT3 

agagaataa3 

T A T T AAAA T A 
T 3 3 A. 3 AG _A.G 
AC.^G33.*V-A 
TAACTA3GA3 
TGGAGCCAAA 
CCCTAA3TCT 
AGTTTTCCCC 
TACCCAAACT 
T.AAGCTTATA 
CTTTCT 3TTT 
ACTTGA3:TG 
C C AAG CC AAA 
ATCAGCTGAG 

GT7CAA7GA3 
TTTG 3T 3 3GG 
AAGCAGTTCA 
TTTGTGTTTT 

acctcaa.3 3a 
tgcaca3 3g a 
ttgaccatat 
actctttagt 
aaagtata3 7 

GTTTATAT 3 3 
T AATT T C AAA 
GTGTCTCTTA 
TAAATAAGT 3 

taaataat: t 
g c aaaaag aa 
aaaacaatat 

G' tTTGT * G . A 

aag aaaa gtt 
gggaag 3-;:;.; 3 



CATGA 3 
TGGAG 3TTG 3 
ACTCTGTAT 3 
G AAATG C AT T 
GGCCTCAAAG 
TCTTCTTGGT 
ACA3TCATCA 
TTGTAAATT 3 
TGTGCGAGTA 
GGTCA3CTAG 
A G T T C AAA T A 



AGGTT 3GTA 3 

GGGAA3TGAG 

CTAATTTCG A 

ATAACTAAAT 

TGTTT 3TGA3 

AGGAAATAGT 

T G AAA G AAA. G 

AACACACTA.C 

AATAAGAATG 

G G A G AAG A G A 

TTCTTCCTCC 

ATATTGTAGA 

TTTATTCATA 

AATTCAGGGG 

ATTTTTAATC 

GTGTGAAGGA 

GTTTT 3CTAG 

C AG ; 3 G AAAAA 

TGTGGCAACA 

GAA3ATGCGA 

ATA 3CAAATT 

GATTAGACAA 

ATGTTTTGAA 

TTAAAATTTG 

CTA3TTTGTA 

TGCGATTGTG 

GTTGTACGGT 

TGTAAATTAG 

CGA3TTGTGT 

TATTTTCGTT 

AACTACGAAG 

AAATCGGGCT 

TAATTGTTAA 

GTCCATTCTT 

AGG3T3AAGT 

CCCTCCCGA.G 

G AG T A 3 AG A C 

ATCGA3CCAC 

CCAGATCCAT 

CTTTGTCCAA 

CTG 3TTACAC 

G :aagttaca 

G3AAAATGGA 
GGTGTCAGGT 
TACTTGGTAG 
A. 3AAGTTCTT 
TATATA.TG.AT 
AAAAAAAAAA 
TAAAAAGGCC 
GTAATAATGC 
G 3TTTATGTA 
AACTTTGGCT 
G G A G AT C A : 3 C 
CGCTCAGTA3 
A3CACTTTG 3 
G 3 AG AAAC G G 
ATCGCAGCTA 
A 3JTGAGCCGA 
CAAAAAACAA 
CTTTGTGTAA 
AACCCATATT 
GGAA 3 3TCTT 
CTTTGTGTGG 
AGGTGTTAGG 
GAGTCTGAGT 
TGGCGACATT 
TTGTGCTTCT 



AGCTG3AA3A GAGCTTAAAG 
ACCCTTAAAA TTAAGTGACT 
T C AT G AAA T T CTACCATTCA 



TTTTAT 3TGT 
GTGGTTTATA 
CAAAG3ATTT 
A 3GGATGAG A 
TTGTAATCAG 
AAGAATTCAA 
TTAGAAAAAT 
TTGTCATTG 3 
CATGCGTTTG 
CAA3ATTTAT 
AAAAAAG A G A 
AAAGTAATCT 
G3TAG3ACTG 
A3GAAGTAT 3 
AAAAAAAAAG 
ATGGA3GA 
G 3 3 G T AA T G A 
CTGGA3ACAT 



GTTTTAAAGA 
A G T G T T AAAA 
TTATGCAATG 
AAG : 3 .AA G AG ■ 3 
TCATAGATGT 
AATCAACACA 
TATTCTATTT 
TTTTCAGGTG 
TA3G3TGTTG 



AT^ 

tg: 

CTA 33GT 
ACAAATTGTC 
AAA: AAAA yr 
GAT 3TAT.AAT 
AAAATT 3AGG 
ACT3A3AACT 
T G A- A T AAAA 
TTAAAATTGT 
GAG 3 3AAAGT 
TCAT 3 3 2 AAG 



;ttctta ctg: 



CAACTTC3TT 
AAAA. 3 T G 3 AA. 
lG GCTAGTTGTC 
TC AGTG 3TTA3T 
CAGTAGT3TG 
TTTCATAGAT 
AGAGGAGGAA 
AACAAGAAGA 
AG AAAC T A C T 
GTTTTCAG3G 
TTAAGATG3A 
AAG3TTTATG 
;3TA AGGT3TTTGG 



AGTCATTGTC 
CAATTTAGGG 
TGAG 3AGA.G A 
T AAA 3GTGGGA 
G GC i \'j3G3>A>j 
j-ortoA GTCTGGAAGC 
GGGCTTGATT 
TTCTGAGTTA 



T T AAAA 3 T C A GATGAAAG: 



TAGTAAGAGG 
ATAGTTAG3T 
GTTACGGG AT 
GGGAAGTTCT 
AAGAAAACCA 
T AAAA G AG T G 
G AA C G T A G AG 
AG3TTTCA3T 
TACCATTGGA 
CCCAGCACCA 



TC3 



AGGAGTT AA 3 AATGACTCCC 
AAGTAATGTA TTGGATCTCT 
TACATGTATA TAAGTCTCCC 
GCCAG.ACTTG C T .AAAA G AAT AGTTT 3 
TCCTAG.ATAT TTGTCGACGT ACCAT3 
T G C C C AAAA G TTGCTAATTG CCAAA7 
GAGCTCTACA GTTTGATTTC 
GACACATGTG AG AT T T AC AA 
CTTCTTTGTT GATGAATGAG 
TCATCCTGA3 TTCCTCCTTG 
GTTTAGTATG TCTTG.AATTC 
AATTAGTTGA GTA.GTTGTT3 
TTGTTGTTG3 TG3TGGTG3T GC-TG' 
GCA.GTGGAG 3 AGTTCACTG 3 AACGAGAG 
TAG C TGGG A. G TAGAGGTATG TGCCACCA 
AGGGTTTGAC CATGTTGGTG AGGC 
CTCAGCCTCC CAAAGTGGTG G 3 .AT 
TGTTTATGTT G3TTGTAGAG TG.A3T 
TTTAAGTCAG TATTTTTTTT T7CAG 
AAGGCCTTTG TA 3TCTGACT CTT3T 
TTTTATGTGA ATTGAATTAG G 3AA3G 3 TAT 
AAT AAT G T T A A G T C T T C C AA ATA 3 T T T A T G 
C AAATG AG T T AAAAACTGTT AACACTAATG 
AATTATGTG 3 TTCCATGTCA TTATTATGTA 
C .AG A C .AT A 3 A G 3 TT ATT ATT GTG3TTTTTA 
AATGTTATGG AA3TGCTAAG GGATGTATT 3 
AACTCCAAAT AAATATGTTG AAA 3CAAGTT 
AAAGTACCAC CATAATAGGC TGT3TG "AG A 
TGAGAAA3TT GAAAAAA.GAA. AG .AAA 3 3 AAT 
C T G G A. A 3 AA T AT 3 T C C T C T C AAA 3 T T T T AG 
GG3CG3AGTG GCTCTTGGCT GTA3TGGGAG 
TGAGGTCA3G AGTTTGAGAC GAG 3CTGACC 
TAAAA 3 AAT A CAAAATT3 " ~" 

GAG3 3CGAAG CA3GAAG; 



T 
A 

GAGCAGCCCC 
AACT 3AAGTA 
GTAA3GATTC 
TTCT3TTTGA 
ATTACGTTAA 
3GTGT 
TGAG A 

1 A 

3A3A3G 



TATTTAAGAA 
A G G AAT G AAA 
GAAAG.*-l i G.~l' j 
GGCCTGGGA.G 
AGTAGAAGAG 
CTGTTGGAGA 
TATTTAT3AG 
TTGGGG3AGG 
ACTATGTATG 
GTAA3TACTT 
CACTGTCTTT 
GCCTCCACTT 
T G AA G AAG T T 
TCCTGAAACC 
ATTATTTTAC 
AACAAAT7GC 
CCGA3AACAG 
TTTATAGCCT 
G T G C AAT C T T 
GA.'3TTTCGCT 

tcgtgggttt 
cccagctaat 
caaactc3tg 
catgagccac 
aaacacaaat 

AAAA AA C AG T T C AA 
T C C A AG CTT7CA.TC 
AAG AAT T A.T A 



GTGTG TA'3TA AA AAT AC AAA 
C T C AG : 3 A' 3 G 3 T AAG G 3 AG AG 



GATCGTGCCA 
AA ' 3 A~- -AA-G A 
AATGTGA3TA 
TTCTCG 3TGT 
CCTCTG 3 3GT 
TAACCTTCTC 
TCATTCATTA 
CACTGG AGA 3 
CCTGATGGAT 
CT AAATG GAT 



TTGCACTCCA 
AAGGTAAGCT 
CATTTGCCTT 
CCCCGCTGGC 
TGAAAAT3CC 
CA 3 3 AG CATC 
CATTATTTTG 
TAGGAGCGAA 
AGTTAATGCT CAAA 



G 3GT 

ATTT, 
TAG 3 
TG 3T 

TAAC 
CTCT 



3 3'3 l 3GG 
AGTTGA 
3 3 3GAA 
.GTATGT 
ATG 3TA 
TT TGT 
GTGTT 
3AAAG 
GTTTA 
ATTCA 
GTTAA 



TCACTCATTG A2TAA 3TATG 



CCAGATGCAA 
AATTAGAGTT 
AA3A.TAATTT 
TTACTG 3TGT 
TAT AT 3 AAA 3 
CGGCAGGC7A 
7ATA.T3 3TTT 
GTTTTTTGGA 
C-3 TTTGGG A. 3 
AAAAATGGAG 
TG 3CTTA3CC 
AG 3AGTT3 3A 
CGTGGTG 3T3 
AGGCAG 3 3AG 
CAAG.A33 3AA 
G A. G A T G T T T A 
AAAAT3TTGA 
TCAGATTGGT 
T ; 3 AAG 3 T .AG G 
AATGAAT 3TG 
T.A 3.ATTCTTG 
TGTGTGGTTT 
CTGGTGAATC 
TA.CTCAAAAT 



9360
9420
9480
9540
9600
9660
9720
9780
9840
9900
9960
10020
10080
10140
10200
10260
10320
10380
10440
10500
10560
10620
10680
10740
10800
10860
10920
10980
11040
11100
11160
11220
11280
11340
11400
11460
11520
11580
11640
11700
11760
11820
11880
11940
12000
12060
12120
12180
12240
12300
12360
12420
12480
12540
12600
12660
12720
12780
12840

12900
12960
13020
13080

13140
13200 
13260 
13320 






AGTAAACACC 
TTAGTTTGAT 
ATCATAATTC 
CGTTCTCAGT 
CAGCAATGAA 
ATTATTGTTG 
TTATCTTAGT 
CCCTCCTTTA 
TATAAAATTT 
TAAATGTTTA 
GG GATG3TG 3 
TGAGCTCAG 3 
TATGAAAACT 
T AAG AT G G G A 
GAGATCGTG - 
AAATGTTATC 
TAAACCTAGA 
AGCACTTT3 3 
TTGGGGTTA 3 
C TAG CAT AG A 
ACTCTCCTG 3 
TACCAACACA 
GTAATGAGTT 
GTATTTCTAT 
AACTAATCAT 
TCCCCATCCA 
GAAGCCTG :A 
TTTTTTTCAA 
G T TA.AAA.2lT A 

agggccagca 

AAG C AC T T T T 
GCCTCTCA3C 
GGGCACGATT 
AT T TAT A T X T 
ACCCTTTATA 
GTAAGAGAAC 
CAAGATTGTT 
AGTAGTA 3 1 A 



AGTAATT i AA 
AAATATGTGT 
A3AACT3ATA 
TAATGCAGTC 
GGCATAAAGG 
TGATTTTTAA 
GATCTTTTCA 
AAAAA 2 T AG G 
AATATTTATA 
T T T AAAT T C A 
CTGACAG 3TA 
AGTTTGAGAC 
TATCT GGGTG 
GAATCGCTTG 
CACTGCACTC 
T AAATAA 3 AT 
T AAAA C T AT C 
G AG G 2 C AAAA 
CTTAATT -AG 

tg:tagg:ct 
ataccattta 
tt aaat aat a 



TAAA 



GT 



G G G AG G AA T G 
AGTTCACAGA 
CATC 3TATTG 
ATCCTGTATC 
T AG T AT AA T C 
AATG3CA3A3 
TTATCATTGT 
C AT A 3 G A. A G C 
AGATGTGTTT 
CTCAAG 3AAT 
ATATTGTTTA 
A C T T AAA CAT 
CATC A GT ATA 
AAAG3TA3AA 



TCCAA. i i CCT 


GCCCATACTG 


GT ATT AC AT A 


AT ATTAAAGT 


A 2 T AAT C AAA 


CATAAATGCT 


ATTAACAAAC 


ACCTTCTGAT 


T 1 3 .AA T AAA G A 


A3ATGCCCTC 


TATTAATGTT 


GGTTTGGGTT 


TCCTTTGCTA 


T ATT ATTTTT 


TGATAATTCT 




AT AT TT AAAT 


GTTTGATAAA 


TTTGTACATC 


A3TTTTTATT 


TAATCCCAGA. 


ACTTTGAGA 3 


CA.CCCT 3 3 3 C 


AA GGTGGTGA 


TGGTGGCA3G 


CATCTGTGGT 


AACCCAGGTG 


A jA'jGG'-jT'JJ 


CAACCT'jG GT 


G>\ ,nGn jTu.n 


AAATTTAATA 


AGTGTTCGCA 


AAATAAGG 2C 


TG 3GTACAGT 


TTATAC AAA G 


TTAGTTGTAT 


AT T AAT T T T T 


TTTAAACTGA 


CATATTAT G 3 


TAG AAAAAT T 


ATTTTT AAA 2 


AAATTTTAAT 


AGATCTACTG 


TGAGGACTAA 


CTCAAGATCT 


C T G C AAT AA 3 


G T C A C AA G T A 


C i A'_ 1 AA I h.-. 


C T AAA. T T T C A 


G A. G 3TTG 3TG 


TTTCTCACAG 


TGAGAACTCC 


GAAGAAGG3 3 


AGAAAGGAAA 


CCTTGATTTT 


ACAGCAAGAT 


GAGACTCCAG 


AT C AAAAAT C 


CCCTTCAGCC 


TTCTCCCACC 


TTCACCTTTA. 


GCATTTTGA3 


CA3T3 3TT3T 


AA GA^GAAGG A. 


AT3 3 AAT3CT 


CA3 3TGAAAC 


A 3 T A 3 AAT A A 


tgattc-'aGt r 


ATG AG AAAAT 


GCTTTCTATC 


GG 3TATTCAT 


AAGGTGAAAC 


CTAGGTGATT 


AACTGACCA3 


tcattgt:t3 


TTAACAGAT3 



CTCAGGTTAA 
GCTG ATAAT A 
GTGG AGCTC A 
TTG GTG AAAA 
CTC i AAG AG T 
TTTAAATATT 
TATTTAAATT 
TTATTTAAAT 
GCCAAGTCAG 
AACCCTGTCT 
CCCAGATG3G 
GGTGGATGTT 



GACT 3CATCT 
CTTA3A.T3AG 
GA3T 3ATGCC 

_ iH. _ '_ /"^^ - i 

GTTTTAAATT 
TTGA3CACAG 
GC AG TAT ATA 
ATTTCTGTAA 
T G T A 3 C AC AA 
TACT 3TG GTT 
AAAAT AC ATG 
T A T T C C AT AA 
A G C AAA T G G A 
TGTATTTATG 
A3CG3AGCTC 
CT3 3 3TGTA 2 

TATATTTATA 
C AG TACT AG 3 
T T C C AAAT AT 
TTCI 33 GATT 
TGAACTTGAA 



TTCAGGTGAA 

GATCATGCTA 

CAAATGTCTG 

GGGCCTTGTT 

CAGCCTAGTC 

AT 3TTTAGAC 

CTTCCTTATC 

AT AAAT AG CT 

T TAT AAT ATT 

GTGTTGGCCA 

GCAAACCATT 

CTACCAAACA 

A3TCCCA3GC 

GCAGTGA3 3T 

C AAAAAAAAA 

C AT AAG G AA C 

TGTAATCTCA 

AAC AAC T AT T 

CCTGCTTACT 

ATTTATGAA.T 

TGTGCCTTTT 

TTTCAAAGTA 

C A G AAAAT AG 

TGTTTCCTGC 

TGTATTTTTT 

CAAAATTCTG 

AGGATTTAAA 

TATTACTTGT 

A G G G AG AAA G 

C CTC AG AT GG 

CAAAC CTTCA 

CCTTC CTAAA 

TTG AC AAT AT 

CCATTTATCT 

CTGATGATTT 

TATTGAATGT 

G 3TAGTATAA 

TTCATTAGGA 



13380 
13440 
13500 
13560 
13620 
13680 
13740 
13800 
13860 
13920 
13980 
14040 
14100 
14160 
14220 
14280 
14340 
14400 
14460 
14520 
14580 
14640 
14700 
14760
14820 
14880
14940 
15000 
15060 
15120 
15180 
15240 
15300 
15360 
15420 
15480
15540 
15600



ATAAAG ATG 


GCT GCT GAA 




A. '3 AC AAT T 




TTT GTG GCA 


15651 


Met 


Ala Ala Glu 


Pro Val Glu Asp Asn Cys Ile Asn


Phe Val Ala 






-35 


-30


-25 






ATG AAA TTT 


ATT GAC AAT ACG CTT TAC TTT ATA


3 GCAAGGC TAATGCCATA 


15702 


Met Lys Phe 


Ile Asp Asn Thr Leu Tyr Phe Ile








-20 




-15




-10






GAAC AAATA 3 


CA 3GTTCAGA 


T AAAT C T A T T 


CAATTAGAAA 




AGGTGAACTA 


15762 


TTAAGTGACT 


CTTTGTGTCA 


CC AAAT TIC A 




A_AT 3 3 3TCTT 


AAAAAAATAG 


15822 


TGGACCTCTA 


G AAAT T AA 3 C 


ACAACA.T3T C 


CAAGGTCTCA 


3 3A3CTTGTC 


ACACCACGTG 


15882


TCCTGGCACT 


TTAATCA3GA 


GTA3 3TCACT 


CT3CA3TTGG 


CAGTAAGTGC 


A 3ATCATGAA 


15942 


AATCCCA3TT 


TTCATGG GAA 


AATCCCAGTT 


TTCATTGGAT 


TTCCA.TG 3G A 


A AAA. T C C C A G 


16002 


TACAAAACTG 


GGTG 3A.TTCA 


GG AAATA 3AA 


TTTC 3 C AAAG 


CAAATTG3CA 


AAT T AT 1 3 T AA 


16062 


GAGATTCTCT 


AAATTTA3AG 


TTCCGTGAAT 


TA 3 A. 3 "ATTT 


TAT 3 FAJAATA 


T G T T T G A C AA 


16122 


GT AAAAAT TG 


ATTCTTTTTT 


TTTTTTTCT G 


TT 2 22 3AGG G 


3 GGA GTG 3 A G 


3 3GCA.CAA.TC 


16182


TCTGCTCACT 


GC AAC CTC GA 


CCT 22T 3 GGT 




3T 3 3T 3 GCTC 


A 3 3 CTT GTG A 


16242 


GTAGCTG GGA 


ctagaggtg: 


AT 3CC 3 3 GAT 




TTTTTGG3; FA 


TTT TT ACT A 3 


16302 


AG AC AG G GTT 


TTGGGATGTT 


GTGCA3 3 3TG 


GTCTTGGACT 


CCT 3 AT CTC A. 


GATGATCCTC 


16362 


CTGGCTCGG G 


CTCCCAAAG4T 


GCTGG 3 ATT A 


CAGG 3ATGAA 


CCACCAC A'„ A 


TGGCCTAAAA 


16422 


ATTGATTCTT 


ATGATTAATC 


tcctgtgaa: 


AAT TT'GGCTT 


CATTTGAAAG 


TTTGCCTTCA 


16482 


TTTGAAA 3CT 


T C AT T I 1 AAAA 


GCCTGAGGAA 


CAAA3TGA3A 




ACAAAAAA3T 


16542 


GCAAAATATC 


ctgtggaca: 


CTCCTA2 2TT 




i ■ j ' j'j.n 


G 3 AT C AC TTG 


16602 


AGCCTAGG AA 


TTTGAG3CTG 


CAGTGAG2TA 


TGATC2CACC 


CCTACACTCC 


A3C3T3 3ATG 


16662 


ACAGTAGACC 


CTGACACA3A 


C AC A 3 AAAA A 


AAAA- C 3' T T ■ 3 A. 


TAAAAAATTA 


TTAGTTGACT 


16722 


TTTCTTAGGT 


GACTTTCCGT 


TTAAG CAATA 


AAT T T AAAA 3 


TAAAATCTCT 


A AT T T T AG AA 


16782 


AATTTATTTT 


TAG TT AC AT A 


TTGAAATTTT 


TAAACCCTAG 


GTTTAAGTTT 


TATGTCTAAA 


16842 


TTACCTGAG A 


ACACACTAA3 


TCTGATAAG 3 


TTC ATTTTAT 


GGG 3CTTTTG 


G ATG ATT AT A 


16902 


TAATATTCTG 


ATGAAAGCCA 


A3ACAGACCC 


TTAAACCATA 


AAAAT AG GAG 


TTCGAGAAA3 


16962 


AGGAGTAGCA 


AAAG T AAAA 3 


CTAGAATGAG 


ATTGAATTCT 


3 A 3 T 3 G AAAT 


AC AAAAT TTT 


17022 






AC AT AT TO TG TTTCTCTCTT TTTCCCCCTC 



A. '■ 3 AAA T G . AA T 
ATTGTCAGCT 
GTGGACTCAG 

cta/3caaaag 
aga3tttgc3 
ggagcgttg: 

CCTCTCAGAA 
AGAAAATTCT 

ccaaaag:tg 

AG AG T AGO- AG 
CGTTCTCT3A 
AG C A C AAT AT 
AC A. 3TA 3CT 3 
AAG TAG TAT 3 
ACCCATCATC 
GTTCATATCC 
TCTAATTGGA 
ATG3CCACAC 
GACTCTTTAA 
AATTGATG3T 
T T A CAA 3 AAA 
GTG AATG CAT 
GAAAGGAGAA 
T T C A 3 T G AAA 
G 3 ATG AC : j 3 A 
CAAA C T C T T C 
TTGTAGGCTT 
GAAATGTGTA 
AAA3 3ATGCA 
A3AAAAACCA 
TGTTGTGTTG 
CAA3AAGGA3 
GAT T G C AAT 3 
AAAGCTTGG3 
A.T ; j ' 3 AG C A G i 
GTGTTAATTG 
AAAAACC.AC 3 
TATGTATATA 
TAA3 3TG 3TT 
CA3TTATATT 
TCAATTCTTA 
TAA 3CATT 3C 

-rv jj ji.-. j 

GTA 3.AGG 3T A 
TCATTTCCCC 
TA3GTGGA 3T 
CCTGAAGAGA 
CTGGAAAACA 
TGTGCCATT 3 
AAA T A A G T T A 
GCTG 3ACA3 3 
GACTG 3 AA 3.A 
TA3AAG3 3TA 
GTAGTTCT3A 
GGCAACTTTA 
A AAA T A T A T A. 
TTAATGTTTA 



TTATTTTTCT 
G AG G AAA AAA 
TAGCACAGCT 
ATGCTTCTCT 
TGTTTCATTG 
AT AG G AAAAA 
ATGCTTTGGG 
GG AAG TAG AG 
AAAGAAACCA 
TAGGAGACTG 
CCCCAAGATG 
GTATTAG CTA 
AATAAGATAG 
GCTGGAAGCA 
CCCTA3TTGT 
CAGTTATCAA 
AGTTAAACAC 
CAAGTGCAAG 
AA T T C AG AAA 
GTbj : j ; jT : jA 3 
TGATGGTGTC 
TTTAGAAACA 
G AAAG TAG AA 
TAG 3G AACAC 
TTTCGTTTTG 
TACATGTG3T 
AT AC ATA 3 AA 
AAG 1 GAGAGA 
G T A G AAAG AA 
AGAGAATTCC 
AATTTTGCAG 
TTT 3GTGAT 3 
A3TG.AAA3A3 
TGTTAAAA 3 3 
TTTAAATCTC 
TAACTAATTG 
CAGAAATAAT 
T A C A G AC A C A 
TTTCTA3TTA 
AAGTTCATG.A 
ATTGGTGAAT 
TGTACACCAA 
AATTGCTGG 3 
CATGCCGAGC 
TTG3 3ACAAA 
A 3CAGG AAAT 
GC AAAG AAA A 
AAA 3 AA 3 TAT 
TTCA 3CAGC A 
G G G A TTT AA T 
G.AA j jAA j jj 3 
TGTTGTGAT A 
AGAGA 3 C3GT 
ATG 3TAACAA 
TTGTAGCTAC 
G CAT ACT TAT 



TTGCA" 
AAAAA 1 
TTGGA: 
ATGCCl 

gtcct; 



A3TA 
GGTT 
.A G AA 
TAAA 
AGAT 



TT-: 



1 A' j 



AA 
Glu



GGGATTGAAG 
AAGAAGCCTG 
G AG ATA.G 3 AA 
TGGCATTTAT 
GTvjA 1 j j A' j 
T G AAAT T T G G 
G G G TAAA G AT 
AGAATTTTTC 
A 3CTGAT 3 AT 
TGATCT 2 ACT 
GAAAGG 3TCA 
ATCAATCCCC 
GAAATCT 3GA 
TAATATATTT 
GAGG 3 3 A- AAA 
AT G AAT T AAG 
CTTGAGA3AA 
AAGATGATGC 
AG GAGG AAG A 
GATCT 3.A3AT 
TCTGAGTTCA 

1 " j j „ 1 j. J 

G3AAAA3 2CA 
G C T AA T .AAA C 
ACCGA3T2CC 
CCTTGA3AAT 
T3AGT3AAAG 
TGAAT33 3.AA 

AAAAT.AAA3A 
AGGCAAT3AA 
GATAG 3TA.CC 
G AAA T G 3 T T A 
GTGATATATA 

tattt:a3.aa 

GTTTGTT ICC 
TG TTCA 3 AC A 
TTG AT AG AAG 
A 2 AAAT G 3 3 A 
A 3GGAGA 3 AG 
CATTTG 3TG A 
T TTG AAA TTG 
T 3 AC AATTGG 
GC.ACTCA 3 3 A 
ATCCT 3 2CCA 
AA3AC3 3G.AA 
TTT AA ' 3 AA 3 A 
C AG AAA 3 TAG 
GAAATT 3 A 3T 
TTCTG3.AACA 
TT3TCAATTA 
AAC 
Asn 



TTAG CT 
Ala 
-10
AGTATCTGCT 
CTCATGCTAC 
G ATG AT CAT A 
AAATTCT 3CA 
TAGCATGAAG 
CATTAGAATT 
GAAGGTTCCG 
* i 'j'j'j'j - 
GATGAATTCA 
CTAGAGTGAT 
A3TTTATCTT 
TAGTTTGTTG 
T C T C AAAG AA 
ATTGGGA 3CC 
CACATAGTT 3 
A3AGAAGTCA. 
CTCATATTC 
AAAT AT AAT 
TTAAAATAT 
TTAAG3GTT 
G TAG A CAT AG 
TCAATA3GA2 
CTACATTTTT 
GCAGGTTTT 3 
GTCTGTG 3AA 
GGACACA3AT 

j-^-i. 1 -~ l.-vi .-^ JJ.~\ 

A 3 T AC T 3 TG 3 
AAC A. 3 AG A 3 3 
A. G GAG AG 3 A.T 
CAAG333CA3 
CA 3 C TTG AT 3 
G T G AA. 3 AAA T 
CAAGACTAGC 
G3TTTGT3CT 
AAAA GAT AAT 



GAA
Glu 



GAT 
Asp



GAT 
Asp 



G 

Glu 



17075 



AGACA3AT 
ATCTG 3CT 
A. GAT A 3 AA 



ATTTTGATAC 
TATTTTTATT 
T3 3A3AT3T3 
TAA 3 3 3 3. AT. A 
A 3 TTT 3TTGT 
TTTGTCTGAT 
A 3 AAAG 3 ATG 
TCAGCCCTA 3 



( „ j j _ i A i ^ 
TATG 2 TT 3T A 
GCCAA 3TTTG 
AATG3TA 3 AC 
G.AGGTTA 3AT 

G 3T.ATTTGAA 
GTG 2CT TTG A 
G AAGATTGTC 
A C AAAG A. AA C 
CTG GAA TCA GAT
Leu Glu Ser Asp



G3TCTTA3AA 
CCATGGATTC 
GTCCAAAATC 
G3TTGGTGGT 
AA G AA G A C C A 
G 3GTAATTCA 
AAA 3AGG 3TG 
G3AGATAATA 
TAAC AAA 3AC 
A3TCTAAGTA 
CCAACCTTCT 
AAAATCATCA 
G-3 2TCATTCC 
ATTGACTAGA 
TTTATTCCAG 
ATTCTGGCTT 
CTATT 



a j.- 



GGG AGTGCTG 
A.TGATTTAGG 
CACTTAGGCA 
GTGTATACAA 
CGTCCTAGTC 
TTGGGCTGGA 
G .AT AAAAA 3 A 

A3ACTAACCA 
TTCAAGATTC 
AACACAGCT'j 
GTG AAA T G G A 
GATACAGATA 
T3 2AAAGTGA 
TTTTTGATTA 
A AT ATG A AAG 
AATATTTCTA 
AAAAGGGATT 
T 3CATGGCAA 
TCTTTGCC CT 

tgttattaa: 

TTTTTCTTCA 
ATTGCCAAAT 
ATACCAGAAA 
TG3TGCCAGA 
AGAGATGATG 
CTATG3AA3A 
AT 3GC.ACCGA 
GA3TTTTGTA 
AA.AATGAAGT 
AG 3. AAAT AC A 
GA3TAAAAGT 
GTTAACACTT 
GAGTAG 3TTA 
ATTAATA3TT 
TATGTATTTT 
TA3 TTT G' 

Tyr Phe Gly Lys Leu Glu 



CTATCTCACC 


17135 


T C AAA G AAA T 


17195 


A G AAG AAC C T 


17255 




17315 


TGTTGTAGGG 


17375 


AGTAACACCT 


17435 


GGGGTGGGGC 


17495 


CATTCAGAGG 


17555 


GAATGGAAGT 


17615 


TAGAGCAAGA 


17675 


GGGTTAATTA 


17735 


AT CC AAAG AT 


17795 


GGCAGCTCAG 


17855 


TCAGTCTTGT 


17915 


TACTTCCTGG 


17975 


TTTCAAA3AC 


18035 


ATTTAATCAC 


18095 


GTAGCCATAT 


18155 


TGGTATAAAG 


18215 


ATTTTAGTTA 


18275 


ATGAGGAGCT 


18335 


GTTGGATTTG 


18395 


ATTTGTA3CA 


18455


AGAGGAGGAT 


18515 


GAGATGTCCA 


18575 


GATAGAGATA 


18635 


C ACATC A 3 AG 


18695 


ACCTACATTT 


18755 


AAAGGGGA3A 


18815 


A'jT''J : JjGA i A : j3 


18875 


TTAGATTTAG 


18935 


GGCAGAG 3CA 


18995 


ATTCTTG 3TA 


19055


GATTGG 3TTG 


19115 


T G AAAAT AAT 


19175 


A T AAAAAT A. T 


19235 


CACTCCTTTC 


19295 


GTACTATACC 


19355 


CGAGTAATTG 


19415 


TTTTATTTAA 


19475 


AATGTTCCCA 


19535 


G GAT AAAA CC 


19595 


TAAAGCTTCA 


19655 


TGGCACTTTC 


19715 


GTTAAATTTG 


19775 


TTGTAGAGGT 


19835


GA.GTGCT 3AA 


19895


TTTGAACG 3T 


19955 


G 3AG AAA 3 AC 


20015


CTGAGATCCA 


20075 


AGAGTCAGGA 


20135 


G T AAG AAAA C 


20195


C A G .A G G C .A G A 


20255


AAAAA C AA T A 


20315 


T TAG AAAAC T 


20375


T.AAATGAGAT 


20435 


AAG CTT GAA 


20486 















5 








1 








5 




TCT 


AAA 


TTA 


TCA 


GTC


ATA 


AGA 


AAT 


TTG 


AAT 




CAA 


GTT 


CTC 


TTC 


ATT 


Ser 


Lys 


Leu 


Ser 


Val 


Ile


Arg 


Asn 


Leu 


Asn 


Asp 


Gln


Val 


Leu 


Phe 


Ile








10 










15 










20 






GAC


CAA 


GGA


AAT 


CGG 


CCT 


CTA 


TTT 


GAA 


GAT 


ATG 


ACT 


GAT 


TCT 


GAC 


TGT 


Asp 


Gln


Gly 


Asn 


Arg 


Pro 


Leu 


Phe


Glu 


Asp 


Met


Thr 


Asp 


Ser 


Asp 


Cys 



20534 



20582 






25 30 35 

AGA G GT ATTTTTTTTA ATTCGCAAAC AT A G AAAT G A CTAGCTA 3TT CTT 
Arg Asp 
40

TGTTTTACTG CTTA 3 ATTGT 



TGACAATAAT 
AAAATTGGAT 



TTCACTTACA 
ACAATAAGAC 



ACCAATCCCT TTATTGTGAT TGCATTAACT 



C A G AT G AAAA GTCACA3 3AG 

AA 

GTCATGCCTC TCTGAG 3 CTG CCTTTGAATC 
GTTTAAAACC TCTATAGTTG GATG 3T7 AAT 



TCCGTGCTAG TCCCAATCC 

GGAAACTTTA TAAGGCATCC ACGTTTTTTA GT7G3 
ATTGCTAG 3 3 



CCCTGCTTGT TACAGCTGAA AATGCTGATA GTTTACCAG 3 TGTGGTG 3CA TCTATCTGTA 



ATCCTAGCTA CTTGGGAG3C 



AAGCAGGA GGATTGCTTG AGGCCAGGAC TTT 3AGGCTC 



TAGTACACTG TGATCGTA CC TGTGAATA 3 3 CACTG 3ACTC CAG 3CTGGGT GAT AT AC AG A 

CCTTGTCTCT AAAAT T AAAA AAAAAAAAAA AAAAAA C C T T A 3G AAAG 3 AA ATT 3ATCAAG 

TCTACTGTGC CTTCCAAAAC ATGAATT 3CA AATATCAAAG TTAGGCTGAG TTG AAG 3AGT 

GAATGTGCAT TCTTTAAAAA TACTGAATAG TTACCTTAAC ATATATTTTA AATATTTTAT 

TTAGCATTTA AAAG T T AAAA ACAATCTTTT AGAATTCATA T G T T T AAAAT ACTCAAAAJAA 

GTTGCAGCGT GTGTGTTGTA AT A C A C AT T A AA GTGTGGGG TTGTTTGTTT GTTTGA 3ATG 

CAGTTTCACT CTGTCACCCA GGGTGAA3TG CAGTGCAGTG CAGTGGTGTG ATCTCG 3CTC 

ACTACAACCT CCA3GTCCCA CGTTCAAGG 3 ATTCTCATGC CTCAGTCTCC CGAGTAGGTG 

3 CATGCACCAC TTACACCCGG CTAATTTTT 3 TATTTTTAGT AGAGGTGGGG 



GG ATT A' 



TTTCACCATG TTGGCCAG3C T3 3TCTCAAA CCGCTAACCT CAAGTGATCT GCCTGCCTCA 
GGCTCCCAAA CAAACAAACA AGGCCACAGT TTAATATGTG TTAGAACAGA CATGCTG 3AA 
CTTTTATGAG TATTTTAATG ATATAGATTA TAAAAGGTTG TTTTTAACTT TTAAATG GTG 



GG ATT AC AC 



CATGAGCCAC TGTGCCAC 



CTGAACTGTG TTT T T AAAAA TGTCTGACCA 



GCTGTACATA GTCTCCTGCA GACTGGCCAA GTCTCAAAGT GG G AA C AG GT GTATTAAGGA 



CTATCCTTTG GTTAAATTTC 
TCATTTATTA TATTTATTTC 



AGT ATG TAT AAA 

Ser Met Tyr Lys 
50 

GTG AAG TGT GAG 



CGCAAATGTT CCTGTGCAAG AATTCTTCTA ACTAGAGT7C 
AG AT AAT GCA CCC CGG ACC ATA TTT ATT ATA 
Asp Asn Ala Pro Arg Thr Ile Phe Ile Ile
40 45

GAT AGC CAG CCT AGA GGT ATG GCT GTA ACT ATC TCT 
Asp Ser Gln Pro Arg Gly Met Ala Val Thr Ile Ser

60 65

A ACT CTC TCC TGT GAG AAC AAA ATT 7^77 
AAA ATT T

Val Lys Cys Glu Lys Ile Ser Thr Leu Ser Cys Glu Asn Lys Ile Ile

70 75 80

TCC TTT AAG GTAAGACTG AG CCTTACTT TGTTTTCAJAT CA7GTTAATA TAATCAATAT 
Ser Phe Lys 

AAT TAG AAAT AT AA. C AT TAT TTCTAATGTT AA TAT A A GTA ATG T AAT TAG AAAACTCAAA 

TAT CCT CAG A CCAACCTTTT GTCTAGAA.CA G AAAT AAC AA GAAGCAGAGA ACC A7 T AAAG 

TGAATACTTA CTAAAAATTA TCAAACTCTT TAG CT ATTGT G AT AA T G AT G GTTTTTCTGA 

G3CTGTCACA GGGGAAGAGG AG AT AC AAC A CTTGTTTTAT GACCTGCATC TCCTGAACAA 



TCAGTCTTTA 
ATGTGACTTT 



TAG AAAT AAT AATGTAGAAT ACATATGTGA GTTATACATT TAAGAA7AAC 



CCAGAAT 3 AG 



CTACACCTTT GTAAATTATG 



i i 

at; 



'GCT ATG AAGA.ATGAA.G 
lTATTTT AATCC CTAGT 



CTAATTATCC TTCTATA.TTT 
TGTTTTGTTG CTG ATC CTTA 



rTAAGTCT TAGACACAAG CTTCAGCTTC CAGTTGATGT ATGTTATTTT TA.ATGTT.AAT 



C 7 AAT T G AAT 
AGG TAT AAAG 
TATTTTTCTC 



AAAAG TTATG AG AT C AG 3 7 G 
TATTTCT 3GC CTCTACTTTT 
TATTTCCTCC A 7 7 AT T G 7 7 A 



TGAGCCAGTA AG AG TAG C C A GG 3 ATG CTTA 



TAAAA 37 AAT GCTATAA.TTA TCTTC.AAGCC 

TCTCTATTAT TCTCCATTAT TATTCTC7AT 

G .AT AAA C C A 3 AA T T AA C T A T AGCTACA 3AC 

C AAAT T G G C A ATGCTTCAGA G G A. G .AAT TCC 



ATGTCATGAA GACTCTTTTT GAGTG 3 AG AT TT3 3CAATAA ATA7CCGCTT TCATGCC^ 



ccagt<: 



CTGA.AAGACA G 7 T A G G AT AT 



:ta 



CCCTCA 
AT TCT A 3 ACC 
GTAGGGTGGA 



GGTAGG 3 .AG A AAAAAGCCAC 

GATACA3 3CC CC AG AC AAAT 

GGT G A ■ 3 AAT T T G G AG T C C < 3 C 

/•lAGAGG 3CT 3 GGAT'j 3.AAGG 

T A AT T A 3 AA- 3 GG AAG 3 A 3 AT G 3 C C AAG 

G3AGAG33AG ATTCAGAAAC TG 3GATAAGT 

AGACTG 3TGA AAA TG T T . AA ■ 3 

AAT C CAT ATT TGGGGGAGCC 

AATGTGGCTG GGCGTGG7G 3 



G.ACCT7'AGTG 
A_AT C 3 A AG T .A 
TCTCCCTCCA 
TGA 3AGGCA3 
AAGG3TTAA.G 



AAGGTACCAA GGGG 3AA 377 

AG AAC AG TG C AT AT 1 3 3 AAC A 

A 3CAGAGTGC C ACC CCT 7 C A 

CTTA G T 7 AT C A. AAAT AG 3 AT 

C A T G ' 3 T \. j T T A CTG .AA C A A ' 3 A 



37C AA3 3TATGTG GGATAGAGGA AAAC7CA 3 ' 



CCGAACCTAC 



A. 1 -. jA 1 l j ■ j .A* t-Hl 

T-3AAGTTTAT 
C7CACACC7G 



'< 3TGG.-i.TTC TTGT7 3A3GG 

TAATG3TTG3 C ACT TAG TAG G AAC7GGG 3 A 

TC.AATTTTGA 7G3CCCT7TT AAAT .AAAAA G 

TAA.TCCCAG 3 ACTTTG JGAG GCCGAGGG 3G 



GGGG ATC ACC TGAAGTCAG 3 AGTTCAAGAC CAGCCTGACC AACATG 3AGA AAC C' 



C TACT AAAAA TACAAAATTA G3TG3GCG7G 
G GAGGC7GAG G C AG ■' 3 AG A A T CTTTTGAACC 



GT 3 3 CAT AT 3 CCTGT.AATCC CAGCTACTv 



CG-: 



AGGTTG3GAT GAG CCT A 3 AT 



CGTGCCATTG CACTCCAG 3C TGGG 3AACAA GA GCAAAACT CGGTCTCAAA AAAAAAAAAA 

AAAAA GTG AA ATT AA C C AAA GG3ATTAGCT TAATAATT'IA 

GGGGGGTG3C TGGAAGAGAT CTGTGTAAAT 

AT CAT AG CAA ATCTGCTTCT G 3AAGGAACT 



ATACTGTTTT TAA3TA3 3 3C 

GAGGGAATCT GACATTTAAG CTTCATCAGC 

CAA.TAAATAT TAGTTGGAG G GGG3GA3A3A 

GTGAGGGGTG G ACT AGG ACC AGTTTTAG 3C CTTGTCTTTA ATCCCTTTTC CTGCCACTAA 



20633 



20693
20753
20813
20873
20933
20993
21053 
21118 
21178 
21238 
21298
21358
21418
21478
21538
21598
21658
21718 
21778 
21838 
21898
21949



21997



22045



22103 

22163
22223
22283
22343
22403 
22463 
22523 
22583
22643 
22703 
22763 
22823
22883
22943 
23003 
23063 
23123
23183
23243
23303 
23363 
23423
23483 
23543 
23603
23663 
23723 
23783 
23843 
23903 
23963 



TAAGGATCTT AGCAGTGGTT ATAAAAGTGG CGTAGGTTCT AGATAATAAG ATACAACAGG 
CCAGG2A3AG TGGCTCATGC CTATAATCCC AGCA3TTTGG GAGGGCAAGG CGAG7G7CTC 
5AGTTCAA GACCAGCCTG GGGAGCATGG CGATAGTGTG TCTCTAGTAA 



AAAAAATAC 



VTT? 



3CC AGGGATGGTG GCATGCACCT GTAATCCCAG CTACTCGT( 



GCCTGAGGCA GAAGAATCGC TTGAAACCAG GAGGTGTA3G CTGGAGTGAG CTGAGATGGG 

ACCACTGCAC TCCAGCCTGG GCGACAGAAT GAGAGTTTGT CTCAAAAAAA G AAAAAG A T A 

CAACA3GCTA CCCTTATGTG CTCACCTTTC AGTGTTGATT ACTAGGTATA AAGTCCTATA 

AAGTTCTTTG GTCAAGAACC TTG AC AAG AC TAAGA3GGAT TTGGTTTGAG AGGTTAGTGT 

CAGAGTGTGT TTGATATATA TACATATACA TGTATATATG TATCTATATC CAGGCTTG33 

CAGGGTTGGG TCAGACTTTC CAGTGCACTT GGGAGATGTT AGGTCAATAT CAACTTTCCC 

TGGATTCAGA TTCAACCCCT TCTGATGTAA AAAAAAAAAA AAAAAAGAAA GAAATCCCTT 

TCCCCTTGGA GCACTCAAGT TTCACCAGGT GGGGGTTTCG AAG7TGGGGG TTCTCCAA 3 3 

TCATTG GGAT TGGTTTCAGA TCCATT7GCT ATGTACCTT3 CGTATGATGG CTG3GAGT3 3 

TCAACATCAA AACTAGGAAA GGTAGTGGGC AAGGATGTCC TTAOCTCTAT TCT3AAAT3T 

GCA.ATAAGTG TGATTAAAGA GATTGGCTGT TCTACCTATC CAGAGTGTCG CTTTCAACTG 

TAACTTTCTT TTTTTCTTTT TTT; - TTT <pTT TCTTTTTTTT TGAAACGGAG TCTGGGTGTG 

TCGCCCAGGC TAGAGTGCAG TGGCACGATC TCAGCTCA3T GCAA3CTCTG CCTCGCGG 3T 

TCA2G 3CATT CTCCTGGCTC ACCCTCCCAA GCAGCTGG3A 

GCCCA33TAA TTTTTTGTAT TTTTAGTAGA GAGGGGGTTT 

TCTCGATCTC CTGAACTTGT GATGCGGCGG CCTCAGCCTC CCAAAGTGCT GGGATTAGAG 

GCGTGAGGCA TCGCACCCGG CTCAAGTGTA ACTTTCTATA CTG3TTCATC TTCCCCTGTA 

ATGTTAGTAG AGCTTTTGAA GTTTTGGCTA TGGATTATTT 

CAGATTAGTT CGAAATTGAT GGCGAGAGCT TAGGGTCTGT 

ACAGGTGCAG AAGTGGGTGC CAATAGGGGA AGTAGTTTAT AGTTTGATCA ACTTAG3ACC 



CTAGAG3CGC CTGCCA2CAT 
CACCGTGTTA GCCAGGAT3 3 



C T C ATT T AT A CATTAGATTT 
TCCTAAATTG TAT ATT 3T AG 



cac; 



P3TT GATAAAGAAC AAAG3TCAAG 



AGTGATTGCA CAAGTGATTG 
CTTTGAGGCA TCTTTGAAGG 



AGTTATGAG 

AGAAGTTGGA GATAACGGCG TGAGGTCTG 3 CATGCAGAG 

ATGAAGAAAT GGTATTTTAA TTTTGGAGGT TTGTCTATCA GTG3TTAGGA TGATG33AAT 

CTGTGCTGCC ATGAGGCCAA AATTAAGTCC AAAACATCTA CTG3TTCCAG GATTAA3ATG 

GAA3AAGGTT AGGTGGTGCC CACATGTTGT GATCCATCCT G3AAAATAGA CATGCTGGAG 

TAACA33AAA AGTGCAGGCA G3AGTAGCAG TTGGATAA33 T33AAGATTA TAGTTT3AA3 

TAATCTAACC ATTTCTCACA AGGCCCTATT CTGTGAGTGA AACATAGAAG AATCTG3ATT 

TGGCCTTCTA AG GCAGGGCC C AG G C AAG G A GACGATATT3 A. 3 3 A 3 AG .AAA TTC AAG ACT A 

CTATG3AACT GGAGTGGTTG GCA33GAAGA CAGA3TCAA3 GA2T3CCAAC TGAGCCAATA 

CAGGAG3GTT A3ACAGGAAC CGAG3GGGTA G3CCTACAA3 AA7TA7TGGG TCTATT3ACT 

GTAAGTTTTA ATTTCAGGCT CGAGTGAAAG AGTAAGCTAA GATT3GTGGC ACTTTCTGTC 

TCTGTCAGAG TTGGCTCAGA AATGA3AACT GGTGAGGCGA G3CATGGTGG CTTACA3CT3 
GAATCCCAGC AGTTTGGGAG GCCGAAGTGG GAGGGTGA3T 
CAGGTTAGGC AACAAAG7GA GATACCCCCT GAGCGCTT3T 
AATTA3CCAA ATGTGGTGGT G TAT ACT TAG A3TCCCAGCT 



3CCCAGGA A! 



: AAGG G T G 3 A 3 T G AC 



GGGATTGGTT 

CTGGCTGGGC AAGAGAGGGA GA3GCTGTGT CAAAGCAAAA 

GCCTAAGTTT GTGGGAGGAG GTGATGATGG TGTTTAGCGG 

CAGAAATTGA CATTAGGGCA AAAA3 3TTGT G3TCTTTG3T 

GCAAATGTGG ACACCACTCA ATG 3 3AGA3G A G A G AA ' 3 T AA GCTGTTTGAT GTATA3 3 33 A 

AAA. G T A 3 AG G CCTG 3AACTG AATAT 3CATG CGATGAGA 3 



T3AG3CCAGG AGTTCAG 3A3 

CTALAAAAAT AAA T T T T AAA 

A 37CA3 3AGG CTGAG33AGG 

AT3ATTTCAC CAGTG 3A3TT 

A 3 AAAAA,GAA ACTAGAACTA 

T3AATGGTTA TTATA3A3GA 

G3AA3TCTAC TTAAT 3TTGA 



AGGAGGTGAG TACTG3TC 



CAG AG ATT TT 



GCAAAACTAC TTTTGTTGTG TTTG3TAATA TACTTCAAAA 
TGTTCATGTC CTGAAATAAT TAGGTAATGT TTTTTTCTCT 



CGI 
Pr: 



G. 
A 

gt:; cca 



AAG ATG AA 1 " 
Asn lie Lyi 



GAT AC A AAA ACT GAC ATC 



G AG A A T AG G A GATTCGGAGT 
7TTTATGTAA CTCTTGAG AA 
CAAACTTCAT ATATTCAAAT 
ATAG G AA ATG AAT C3T 
Glu Met Asn Fro 
9 5 

ATA TTC TTT CAG AG. A 



Asp 



AGT 

Ser Val 
105 

GAA GGA 
Glu Gly Tyr Ph 



GGA CAT GAT 
Pro Gly His Asp 



Thr Lys Se r Asp 
q 5 

AAT AA3 AT 3 C AA 

Asn Lys Me: Gin 



110 

TAC TTT CTA GCT TGT GAA AAA GAG 



Leu Ala Cys Glu Lys Glu 



lie lie Phe Phe Gin Ar- 

1 0 0 

TT" 
I" he 
115 

AGA GAG 
Arg AjD 



ATT TTG AAA AAA 



12 : 

GAC 



13 0 

GAT GAA TTG GGG GAT AGA 



AA TGT TCA TCA TA3 
lu Ser Ser Ser Tyr 
120 

CTT TTT AAA CTC 
Leu Phe Lys Leu 
13 5 

ATA ATG TTC A.CT 



Asp Arg S 

14 5 

T AG CT ATT AAA ATTTCATGCC GGGCGCAGTG GCTCACGCGT 



lie Met 
150 



Phe Thr 



lie Leu Lys Lys Glu Asp Glu Leu Glj 
14 0 

GTT CAA AAC GAA GAC 
Val Gin Asn Glu Asp 
155 

GTAATCCCAG CCCTTTGGGA GGCTGAGGCG G3CAGATCAC 
CCAGCCTGAG CAACATGGTG AAACCTCATC TCTACTAAAA ATACAAAAAA TTAGCTGAGT 



CA3AG 3TCAG GTGTTCAAGA 



24023 

24033 

24143 

24203 

24263 

24323 

24333 

24443 

24503 

24563 

24623 

24633 

24743 

24803 

24863 

24923 

24533 

25043 

25103 

25163 

25223 

25233 

25343 

25403 

25463 

25523 

25533 

25643 

25703 

25763 

25823 

25833 

25943 

26003 

26063 

26123 

26183 

26243 

26303 

26363 

26423 

26433 

26543 

26603 

26663 

26723 

26733 

26839 



26837 



26935 



26933 



: 7 0 3 1 



27087 



27147 
27207 
GTAGTGACCC ATGCCCTCAA TCCCAGCTAC TCAAGAGGCT GAGGCAGGAG AATCACTTGC 27267 

ACTCCGGAGG TGGAGGTTGT GGTGAGCCGA GATTGCACCA TTGCGCTCTA GCCTGGGCAA 27327 

CAACAG2AAA ACTCCATCTC AAAAAATAAA ATAAATAAAT AAACAAATAA AAAATTCATA 27387 

ATGTGAACTG TCTGAATTTT TATGTTTAGA AAGATTATGA GATTATTAGT CTATAATTGT 27447 

AATGGTGAAA TAAAATAAAT ACCAGTCTTG AAAAACATCA TTAAGAAATG AATGAACTTT 27507 

CACAAAAGCA AACAAACAGA CTTTCCCTTA TTTAAGTGAA TAAAATAAAA TAAAATAAAA 27567 

TAATGTTTAA AAAATTCATA GTT7GAAAAC ATTCTACATT GTTAATTGGC ATATTAATTA 27627 

TACTTAATAT AATTATTTTT AAATCTTTTG GGTTATTAGT CCTAATGACA AAAGATATTG 27687 

ATATTTGAAC TTTCTAATTT TTAAGAATAT CGTTAAACCA TCAATATTTT TATAAGGAGG 27747 

CCACTTCACT TGACAAATTT CTGAATTTCC TCCAAAGTCA GTATATTTTT AAAATTCAGT 27807 

TTGATCCTGA ATCCAGCAAT ATATAAAAGG GATTATATAC TCTGGCCAAC TGACATTCAT 27867 

CCTAGGAATG CAAAGATGGT TTAATATCCT AAAATCAATT AACATAACAT ACTATATTAA 27927 

TAAAGTATCA AAACAGTATT CTCATCTTTT TTTCTTTTTT CACAATTCCT TGGTTACACT 27987 

ATCATCTCAA TAGATGCAGA AAAAGCATTT GACAAAATCC AATTCATAAT AAAAATTCTC 28047 

AAACTTGAAA GAGAACATCA TAAAG3CATC TATGAAAAAC CTACAGCTAA TATCATACTT 28107 

AACGATGAAA AACTGAATTA TTTTACCCTA AGATCAAGAA TAATGCAAGC ATGTCAGCTC 28167 

TTGCAACTTC TATTCAACAT TGTACTGGAG GTTCTAGCCA GAGCAACCAT ACAATAAATA 28227 

AAAATAAAAG GCACCCAGAT TAGAAAGGAA GTCTTTATTT GCAGACAACA TGGTTCTTTA 28287 

TGCAGAAAAC CGTCAGGAAT ACACACACAT GTTAGAACTA ATAAGTTCAG CAAGGTTGCA 28347 

GGTTGCAATA TCAATATGCA AAAATACATT GAAGGCTGGG CTCAGTGGAG ATGGCATGTA 28407 

CCTTTCGTCC CAGCTACTTG GGAGGCTGAG GTAGGAGGAT CACTTGAGGT GAGGAGTTTG 28467 

AGGCTATAGT GCAATGTGAT CTTGCCTGTG AATAGCCACT GCACTCGAGC CTAGGCAACA 28527 

AAGTGAGACC CCGTCTCCAA AAAAAAAAAT GGTATATTGG TATTTCTGTA TATGAACAAT 28587 

GAATGATCTG AAAACAAGAA AATTCCATTC ACGATGGTAT TAAAAAAATA AAATACAAAT 28647 

AAATTTAGCA AAATAATTAT AAAACTTGTA CATCGAAAAT TTCAAAGCAC TCTGAGGGAA 28707 

ATTAAA3ATG ATCTAAATAA TTGGAGAGAC ACTCTATGAT CACTGATTGG AAAATTCATT 28767 

CAATATTGTT AAGATAACAA TTGTCCCCAA ATTGATGCAT GCATTCAATT TAGTCTTCAT 28827 

CAAAATTCCA GCAGGGTTTT TGCAGAAATT GACAAGCTGT ACCCAAAATG TATATGGAAA 28887 

TGAAAAGACC CAGAAGAGCA AATAATTTTT TAAAAACAAA GTTGGAAAAC TTTTACTTCC 28947 

TAATTTTAAA ACTTACTATA AA2CTAAAGT TATCAAGACC ATTTAGT 28994 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: N- terminal fragment 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Tyr Phe Gly Lys Leu Glu Ser Lys Leu Ser 
1               5                   10 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
CCATCCTAAT ACGACTCACT ATAGGGC 27 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TTCCTCTTCC CGAAGCTGTG TAGACTGC 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
CTATAGGGCA CGCGTGGT 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 28 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
TTCCTCTTCC CGAAGCTGTG TAGACTGC 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 30 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
GTAAGTTTTC ACCTTCCAAC TGTAGAGTCC 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
GGGATCAAGT CGTGATCAGA AGCAGCACAC 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
CCTGGCTGCC AACTCTGGCT GCTAAAGCGG 



(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
GTATTGTCAA TAAATTTCAT TGCCACAAAG TTG 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
AAGATGGCTG CTGAACCAGT AGAAGACAAT TGC 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
TCCTTGGTCA ATGAAGAGAA CTTGGTC 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 
CCTGGAATCA GATTACTTTG GCAAGCTTGA ATC 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
GGAAATAATT TTGTTCTCAC AGGAGAGAGT TG 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
GCCAGCCTAG AGGTATGGCT GTAACTATCT C 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
GGCATGAAAT TTTAATAGCT AGTCTTCGTT TTG 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
GTGACATCAT ATTCTTTCAG AGAAGTGTCC 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
GCAATTTGAA TCTTCATCAT ACGAAGGATA C 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
TCCGAAGCTT AAGATGGCTG CTGAACCAGT A 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 32 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
GGAAATAATT TTGTTCTCAC AGGAGAGAGT TG 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
ATGTAGCGGC CGCGGCATGA AATTTTAATA GCTAGTC 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
CCTGGAATCA GATTACTTTG GCAAGCTTGA ATC 